rkowen@violet.berkeley.edu (Dario Bressanini) (04/10/91)
>> Consider for a moment (yes, these are not equivalent):
>>
>>     x = ++c;    vs    x == c++;
>>
>> These can be "compiled" as:
>>
>>                      temp_001 <-- c;
>>     c <-- c + 1;     c <-- c + 1;
>>     x <-- c;         x <-- temp_001;
>>
>> (now throw away the "x=" part (last "instruction")).
>>
>> So, "++c" is ``cleaner'' in some pedantic sense[1], and I suppose a
>> sufficiently lacking compiler might actually produce slower code
>> for "c++;" than for "++c;".
>>     ^^^^^^^^^^^^^^^^^^^^

I always read this group with interest; I have learned a lot following
many discussions, but sometimes the "this statement is faster than the
other" war comes up, as in this case. I don't want to discuss this
particular case, but to make some general personal observations.

Based on my experience, I find it hard to believe that in real-world
code one can improve the performance of a program using this kind of
"optimization". When I REALLY HAVE to optimize a program, first of all
I use a profiler to see where the bottleneck is, and THEN I try to
optimize it. I am probably biased, since I mostly write programs (90%
in FORTRAN) for scientific computation (yes, I use floating point :-) ),
where usually you have a single routine, or even a single loop, that
takes 95% of the CPU time. In most cases the best way to gain speed was
to change the algorithm completely, not to make some very low-level
optimization.

Following the latest "this is faster than that" wars I had the
impression that they were purely theoretical discussions, without any
connection to the "real world", at least to my world.

Just in case... I don't want to start the usual and useless C vs.
FORTRAN war, etc. I would like to use C for my programs, but in most
cases I tried, the code produced by the C compiler was 2 or 3 times
slower than the Fortran code.

dario bressanini
juul@diku.dk (Anders Juul Munch) (04/11/91)
rkowen@violet.berkeley.edu (Dario Bressanini) writes:

>>> Consider for a moment (yes, these are not equivalent):
>>>
>>>     x = ++c;    vs    x == c++;

Of course you mean x = c++; - right?

>>> These can be "compiled" as:
>>>
>>>                      temp_001 <-- c;
>>>     c <-- c + 1;     c <-- c + 1;
>>>     x <-- c;         x <-- temp_001;
>>>
>>> (now throw away the "x=" part (last "instruction")).
>>>
>>> So, "++c" is ``cleaner'' in some pedantic sense[1], and I suppose a
>>> sufficiently lacking compiler might actually produce slower code
>>> for "c++;" than for "++c;".
>>>     ^^^^^^^^^^^^^^^^^^^^

> I always read this group with interest, i have learned a lot following
> many discussions, but sometimes the "war" "This statement is faster
> than the other" comes up, like in this case. I don't want to discuss
> this particular case, but to make some general personal observations.
> Based on my experience, I hardly believe that in "real world code"
> one can improve the performance of a program using this kind
> of "optimization".
> [stuff deleted]

You're probably right; this is hardly a worthwhile optimization. He
does have a point, though, in that post-increment/decrement is a rather
clumsy construct from a compiler's point of view. As long as it is such
a simple statement, any compiler can sort out how to do it the fastest
way anyway. But when the increment is in the middle of some assignment
or array index, it may not be able to do that.

I once saw an implementation of quicksort in a magazine which contained
code like:

	array[index++] = something;

They stated it was written that way for efficiency. I tried compiling
that code, together with the rewrite

	array[index] = something;
	index++;

And then I looked at the code that my "real-world" compiler (Turbo C++
1.0) generated for these two code fragments - identical!

The moral of the story is: don't use increment/decrement operators
within other expressions - it makes the code less readable, and it
probably doesn't give you any performance gain anyway.
- Anders Munch, juul@diku.dk
/------------------------------------------------------\
|       (Insert your disclaimer of choice here)        |
|                                                      |
\------------------------------------------------------/
andand@cia.docs.uu.se (Anders Andersson) (04/15/91)
In <1991Apr9.213601.12309@agate.berkeley.edu> rkowen@violet.berkeley.edu (Dario Bressanini) writes:

>>> Consider for a moment (yes, these are not equivalent):
>>>
>>>     x = ++c;    vs    x == c++;
>>>
>>> These can be "compiled" as:
>>>
>>>                      temp_001 <-- c;
>>>     c <-- c + 1;     c <-- c + 1;
>>>     x <-- c;         x <-- temp_001;
>>>
>>> (now throw away the "x=" part (last "instruction")).
>>>
>>> So, "++c" is ``cleaner'' in some pedantic sense[1], and I suppose a
>>> sufficiently lacking compiler might actually produce slower code
>>> for "c++;" than for "++c;".
>>>     ^^^^^^^^^^^^^^^^^^^^

> I always read this group with interest, i have learned a lot following
> many discussions, but sometimes the "war" "This statement is faster
> than the other" comes up, like in this case. I don't want to discuss
> this particular case, but to make some general personal observations.
> Based on my experience, I hardly believe that in "real world code"
> one can improve the performance of a program using this kind
> of "optimization".
> When i REALLY HAVE to optimize a program, first of all i use
> a profiler to see where the bottleneck is, and THEN i try to optimize it;
> probably I am biased since i mostly write programs (90% in FORTRAN)
> for scientific computation (yes, I use floating point :-) ) where usually
> you have a single routine, or even a single loop, that takes 95% of
> the CPU time.
> In most cases the best way to gain speed was to change the algorithm
> completely, and not to make some very low level optimization.
> Following the latest "this is faster than that" wars I had
> the impression that they were purely theoretical discussions, without any
> connection with the "real world", at least to my world.
> Just in case.... I don't want to start the usual and useless war
> C vs FORTRAN etc., i would like to use C for my programs, but in most
> cases i tried, the code produced by the C compiler was 2 or 3 times
> slower than the Fortran code.

To me it seems you are contradicting yourself.
First you tell us that you only find algorithmic improvement
worthwhile; then you complain about slow code produced by C compared to
Fortran. As you are probably well aware, the speed difference you
noticed is very dependent on computer and compiler, but a few
observations may be of interest:

You cannot do much low-level optimization yourself in Fortran, right?
So Fortran compilers tend to have very clever optimizers, especially
when the program does number crunching, which is often quite easy for
an optimizer to handle. C, on the other hand, allows the programmer a
lot more freedom to tune his or her program, using register variables,
pointer arithmetic and other interesting stuff. This freedom gives
optimizers a hard time (pointers especially are a big problem, since an
optimizer cannot assume that certain values can be kept in registers if
an indirect assignment can change them). Fortran may therefore have an
edge on number crunching, but I doubt it holds in general (not that
many successful operating systems, compilers, word processors or
whatever are written in Fortran - at least not that I know of, no need
to flame).

Further, were the C versions of the programs you compared tuned by
somebody who is used to doing the tuning? Were they tuned at all? If
you want maximum speed you have to do pointer arithmetic rather than
array indexing on many machines/compilers. (Since you can do this
yourself in C, most optimizers don't bother with it, while Fortran
optimizers virtually always do it for you.) When converting from
Fortran you also have to consider the column-major versus row-major
storage used for two-dimensional arrays, and so forth.

Most of the time improvement of the algorithm is preferable, but when
you have the best known algorithm and your program is still sluggish,
what do you do then? Turn to assembly language?
I have found low-level optimizations of the kind discussed most helpful
in tight loops, sometimes speeding them up by a factor of two or so.
The only problem is that many such optimizations are machine-dependent
(sometimes even compiler-dependent). For example: the MC68000 series
has indirect addressing modes with post-increment and pre-decrement,
corresponding to C expressions like *(p++) and *(--p) respectively.
There are no pre-increment or post-decrement addressing modes, however.
Knowing this may speed up many loops when running on a 68k, but may not
have any effect whatsoever on another machine. (I bet there is some
machine out there that has only pre-increment and post-decrement, on
which it will run slower...)

-Anders Andersson (andand@cia.docs.uu.se)
andand@cia.docs.uu.se (Anders Andersson) (04/15/91)
In <1991Apr11.150858.4242@odin.diku.dk> juul@diku.dk (Anders Juul Munch) writes:

[ stuff deleted ]

> I once saw an implementation of quicksort in a
> magazine, which contained code like:
>	array[index++] = something;
> They stated it was written that way for efficiency.

That's crazy. If you want efficiency in QSort you should use pointer
arithmetic like:

	*(pointer++) = something;

Then it makes a difference... OK, OK, not true on all machines, but
*absolutely* true on an MC680x0. (Yes, I've tried it!)

> I tried compiling that code, together with the rewrite
>	array[index] = something;
>	index++;
> And then I looked at the code that my "real-world" compiler
> (Turbo C++ 1.0) generated for these two code fragments -
> identical!

I can't imagine a compiler where they wouldn't be. (Unless there are
some strange addressing modes on the machine.)

> The moral of the story is, don't use increment/decrement operators
> within other expressions - it makes the code less readable
> and it probably doesn't give you any performance gain anyway.

The moral of my comment is: if you want speed you'd better not stop
halfway - go for pointer arithmetic and make your code truly
unreadable. (But run a profiler first... You might not want it
everywhere...)

[ name and disclaimer deleted ]

-Anders Andersson (andand@cia.docs.uu.se)
gwyn@smoke.brl.mil (Doug Gwyn) (04/16/91)
In article <andand.671719926@cia.docs.uu.se> andand@cia.docs.uu.se (Anders Andersson) writes:
-In <1991Apr9.213601.12309@agate.berkeley.edu> rkowen@violet.berkeley.edu (Dario Bressanini) writes:
->Just in case.... I don't want to start the usual and useless war
->C vs FORTRAN etc..,i would like to use C for my programs, but in most
->cases i tried, the Code produced by the C compiler was 2 or 3 times
->slower that the fortran code.
-As you are probably well aware, the speed difference you noticed is very
-dependent of computer and compiler, but a few observations may be of interest:
-You cannot do much low-level optimization yourself in Fortran, right?
-So, Fortran compilers tend to have very clever optimizers, especially when
-the program does number crunching which often is quite easy to handle for an
-optimizer.

I don't think that is the proper explanation of the observed
phenomenon. For any commercial-quality system, I would be amazed if an
equally competent implementation of nearly any algorithm in Fortran and
C gave much edge to Fortran. The one possibility for a significant
speed difference of which I am aware is the requirement for pre-ANSI C
that all floating-point computations be performed in double precision,
even if the variables are declared as single-precision. Standard C
discards that old requirement.
jlg@cochiti.lanl.gov (Jim Giles) (04/16/91)
>> array[index++] = something;
>> [...]
>
> That's crazy. If you want efficiency in QSort you should use
> pointer arithmetic like:
>
>	*(pointer++) = something;

If your compiler does something which makes the first slower than the
second, it's time to get a new compiler. At some point in the code you
had to do something like:

	pointer = &array[1];

or even

	pointer = array;

The compiler should be able to perform the same operation outside any
loops and achieve the same level of optimization. The user's code
needn't contain such things.

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/17/91)
> [...] For any commercial-quality system, I would be amazed if an
> equally competent implementation of nearly any algorithm in Fortran
> and C gave much edge to Fortran. [...]

We've been through this again and again. C can't hold a candle to
Fortran for array manipulation because Fortran is free to assume that
array arguments to procedures are _not_ aliased to each other or to
globals. C must assume that all pointers are possibly aliased to each
other, to globals, and to any locals to which the & operator has been
applied. There is not a single significant optimization technique that
is not inhibited to some extent or other by aliasing.

Some C implementations have local extensions (using #pragma statements)
that allow the programmer to declare that certain variables are not
aliased - all very non-portable. Some experimental systems allow the
compiler to do restricted types of interprocedural analysis to detect
aliasing - this is non-standard since it violates separate compilation
constraints. I've never seen a purely standard-conforming C compiler
that can come close to Fortran.

J. Giles
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/17/91)
In article <21527@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
> We've been through this again and again. C can't hold a candle to
> Fortran for array manipulation because Fortran is free to assume that
> array arguments to procedures are _not_ aliased to each other or to
> globals.

Except on vector machines, Fortran array code can be converted into C
pointer code which will run at the same speed *without* the optimizer
that Fortran requires. Aliasing does not inhibit this (usually rather
mechanical) translation. On vector machines aliasing is a problem, but
they provide adequate directives to control aliasing.

On the other hand, no matter how powerful your Fortran optimizer is,
there will be a pointer program P for which any array code translation
will be SLOWER. This is a simple application of the halting problem.
Most importantly, that pointer program P comes up frighteningly often
in practice.

Looks to me like C wins here.

> There is not a single significant optimization
> technique that is not inhibited to some extent or other by aliasing.

Hand optimization. Try it, and your programs will run a lot faster for
relatively little effort.

> Some experimental systems allow the
> compiler to do restricted types of interprocedural analysis to detect
> aliasing - this is non-standard since it violates separate compilation
> constraints.

The second half of that statement is incorrect. The compiler must act
AS IF the code had been separately compiled. This in no way prohibits
interprocedural optimizations. Read the standard (or any of the
available elementary introductions to it) to learn about the as-if
rule.

---Dan
gwyn@smoke.brl.mil (Doug Gwyn) (04/17/91)
In article <21527@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>We've been through this again and again.

Indeed.

>C can't hold a candle to Fortran for array manipulation because
>Fortran is free to assume that array arguments to procedures are
>_not_ aliased to each other or to globals. C must assume that all
>pointers are possibly aliased ... There is not a single significant
>optimization technique that is not inhibited to some extent or other
>by aliasing.

The latter is a gross overgeneralization, and in any event the
assertion is simply not true. There are SOME circumstances under which
aliasing must be assumed to be possible, but by no means ALL
circumstances.

I wasn't particularly referring to array-intensive applications anyway,
because most interesting "scientific" applications that I have
encountered require more flexibility than the use of fixed arrays
provides. Each language should be dealt with on its own terms, not
treated as an inferior substitute for one's favorite language.

>I've never seen a purely standard conforming C compiler
>that can come close to Fortran.

Well, I have.
jlg@cochiti.lanl.gov (Jim Giles) (04/17/91)
In article <15870@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes:
|> In article <21527@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
|> >[...] C must assume that all
|> >pointers are possibly aliased ... There is not a single significant
|> >optimization technique that is not inhibited to some extent or other
|> >by aliasing.
|>
|> The latter is a gross overgeneralization, and in any event the
|> assertion is simply not true. There are SOME circumstances under
|> which aliasing must be assumed to be possible, but by no means ALL
|> circumstances.

All array arguments to a procedure must be assumed aliased to each
other and to any globals of the same base type. Indeed, given that most
C programmers violate the rules prohibiting type casts of pointers to
different types, all arrays should be assumed aliased to globals and
other pointers of _any_ type. Any locally declared pointers which are
assigned from any of the above (pointer arguments and/or the addresses
of globals) must also be assumed aliased to these things.

|> [...]
|> I wasn't particularly referring to array-intensive applications
|> anyway, because most interesting "scientific" applications that
|> I have encountered require more flexibility than the use of fixed
|> arrays provides. Each language should be dealt with on its own
|> terms, not treated as an inferior substitute for one's favorite
|> language.

Then you shouldn't make unqualified statements that are not true. Your
claim was that C was inherently as fast or faster than Fortran for all
operations (you made the exception of _old_ C implementations that
forced double computation instead of float). To now say that you
weren't including array-intensive applications is a little late. Most
of the array calculations I have encountered require more flexibility
than pointers provide - like the flexibility to declare two things
_not_ to be aliased.

|> [...]
|> >I've never seen a purely standard conforming C compiler
|> >that can come close to Fortran.
|>
|> Well, I have.

Really? I'm sitting on the edge of my chair. Where can I learn more
about this paragon? It must use compilation techniques that are not
described in any of the existing literature, since I have verified to
my satisfaction that 1) aliasing _does_ inhibit _all_ major
optimization techniques, and 2) most uses of C pointers are either
aliased in truth, or are not distinguishable (by the compiler) from
aliased ones. The only way around this difficulty is interprocedural
analysis - but C requires separate compilation.

J. Giles
mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (04/18/91)
In article <21660@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>The only way around this difficulty [detecting aliasing] is
>interprocedural analysis - but C requires separate compilation.

Where in heaven's name did you get the idea that separate compilation
precludes interprocedural analysis? Ken Kennedy's group at Rice
University has for years been doing extremely fancy interprocedural
analysis of Fortran programs while simultaneously providing separate
compilation. Take a look at the work done on PFC back in the late 70's
and early 80's. They operate under the same constraints that C
demands - that object code has to respect the semantics of the
language, regardless of what analysis is done.

(Note, by the way, that Fortran compiler writers seem to find
interprocedural analysis necessary for producing efficient code from
Fortran in spite of the rules against aliased parameters.)

-mike
jlg@cochiti.lanl.gov (Jim Giles) (04/18/91)
In article <1991Apr17.190243.24691@watmath.waterloo.edu>, mhcoffin@tolstoy.waterloo.edu (Michael Coffin) writes:
|> [...]
|> Where in heaven's name did you get the idea that separate compilation
|> precludes interprocedural analysis? [...]

Separate compilation and interprocedural analysis are (in the abstract)
mutually exclusive. You can, of course, adopt some intermediate
approach in which _some_ interprocedural information is available at
compile time. To the extent that you do this, you do _not_ have
separate compilation.

The only information available to the compiler about procedures that
are separately compiled is the names of the routines that the current
procedure calls and the number and type of their arguments and results
(if any). For a language which requires separate compilation,
_NO_OTHER_INFORMATION_ is known about an external procedure. The reason
is that _all_ other characteristics of an external procedure are
allowed to _change_ without the need to recompile any uses of that
procedure.

The bottom line is that separate compilation implies that procedures
may be used together for which the _source_ of the separate routines
has never even existed in the same programming environment and only the
object code has been distributed. Further, even the object code of each
may not be available at the time each of the other procedures is
compiled. Hence, under separate compilation there may be no internal
information about other procedures used when a given procedure is
compiled, because such information may not exist, may be speculative,
or may subsequently change without notice.

To be sure, a great deal of research has been done on the consequences
of various compromises to separate compilation. Some techniques require
that more information about externals be made available through
interface specs or header files. Some techniques move some of the
compilation process into later tools (like the loader).
Indeed, by moving the code generation and optimization phases of
compilation into the loader, you can have the benefits of full
interprocedural analysis while technically remaining compatible with
language specifications which require separate compilation. In fact,
such a loader could inline separately compiled procedures, do constant
folding across procedure boundaries, and other amazing things.
Hypothetically, such an environment could optimize the infamous "hello
world" test code into a single system call. _I_ have not _seen_
production-quality compiler/loader combinations that do this.

J. Giles
henry@zoo.toronto.edu (Henry Spencer) (04/18/91)
In article <21660@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>... The only
>way around this difficulty is interprocedural analysis - but C requires
>separate compilation.

And the top traffic light has to be red - but blueberries are blue.
The two parts of that statement have about as much to do with each
other as the two parts of yours. There is no shortage of
standard-conforming C compilers that do vigorous interprocedural
analysis if asked to; the ones from MIPS, for example.

Even actual separate compilation does not preclude interprocedural
analysis, although it certainly makes it harder. But the problem at
hand is nowhere near that hard, since you can simply decline to provide
both capabilities simultaneously. You do interprocedural analysis
across all source files supplied to a single invocation of the
compiler, and if the user is supplying some modules as object files,
the interprocedural analysis simply doesn't extend over their
boundaries.
-- 
And the bean-counter replied,    | Henry Spencer @ U of Toronto Zoology
"beans are more important".      | henry@zoo.toronto.edu utzoo!henry
gwyn@smoke.brl.mil (Doug Gwyn) (04/18/91)
In article <21660@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
-All array arguments to a procedure must be assumed aliased to each
-other and to any globals of the same base type.
No, this is not true (assuming you are talking about pointer
arguments; C does not have array arguments). As I and others
have recently told you, sometimes this must be assumed and
other times it need not be, depending on context.
-Indeed, due to the fact that most C programmers violate the rules
-prohibiting type casts of pointers to different types, all arrays
-should be assumed aliased to globals and other pointers of _any_
-type.
We know you're a Fortran bigot, but that's ridiculous.
-|> I wasn't particularly referring to array-intensive applications
-|> anyway, because most interesting "scientific" applications that
-|> I have encountered require more flexibility than the use of fixed
-|> arrays provides. Each language should be dealt with on its own
-|> terms, not treated as an inferior substitute for one's favorite
-|> language.
-Then you shouldn't make unqualified statements that are not true.
-Your claim was that C was inherently as fast or faster than Fortran
-for all operations (you made the exception of _old_ C implementations
-that forced double computation instead of float). To now say that you
-weren't including array-intensive applications is a little late.
You have this annoying habit of pretending I said things that I didn't.
I didn't EXCLUDE array operations in the statement cited above, and I
didn't say that C was inherently as fast as Fortran for all operations.
What I said was:
->>> I don't think that is the proper explanation of the observed phenomenon.
->>> For any commercial-quality system, I would be amazed if an equally
->>> competent implementation of nearly any algorithm in Fortran and C gave
->>> much edge to Fortran. The one possibility for a significant speed
->>> difference of which I am aware is the requirement for pre-ANSI C that
->>> all floating-point computations be performed in double precision, even
->>> if the variables are declared as single-precision. Standard C discards
->>> that old requirement.
I don't see any point in continuing to argue with you, any more than
the last time you stuck your nose into the C newsgroup to run down the
language and promote Fortran for everything. If you want a more complete
picture of the "aliasing problem" in C, you should read some of the
papers from NCEG. "A little knowledge is a dangerous thing."
mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (04/18/91)
In article <21703@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>For a language which requires
>separate compilation, _NO_OTHER_INFORMATION_ is known about an
>external procedure. The reason is that _all_ other characteristics
>of an external procedure are allowed to _change_ without the need to
>recompile any uses of that procedure.

The C standard doesn't say Thou Must Compile Separately. It says that
the meaning of the program - roughly speaking, the output obtained by
running the program - doesn't depend on information gleaned from
intermodule analysis. Nothing in the standard says that a compiler
can't take advantage of all available information to produce good code.

-mike
jlg@cochiti.lanl.gov (Jim Giles) (04/18/91)
In article <1991Apr17.225944.15261@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes:
|> [...] You do interprocedural
|> analysis across all source files supplied to a single invocation of the
|> compiler, [...]

And if you subsequently change _some_ of those source files? What then?
Under this model, you'd have to recompile _all_ the others that were
originally compiled together. This violates one of the principal
reasons to have separate compilation to begin with - the elimination of
the need to recompile most of your code every time you change a piece
of it.

Note: I'm not arguing that what you suggest is a bad idea. I think
personally that what you've said here is a good way to improve
performance. I'm just pointing out that it _does_ violate the standards
of both Fortran and C, which are defined to be separately compiled.
There are a lot of _non-standard_ ways to improve performance. But it
is always a mistake to allow someone to claim that any of these are
really standard after all.

Finally, I don't have access to any production-quality compilers which
do interprocedural analysis for either Fortran or C. Fortran doesn't
need the feature as much as C does.

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/19/91)
In article <15881@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes:
|> [...] As I and others
|> have recently told you, sometimes this must be assumed and
|> other times it need not be, depending on context.

You can tell me that until you are blue in the face. I have never seen
any of these hypothetical contexts in which anonymous pointer arguments
passed in as parameters can be safely assumed not to be aliased. Please
give an example. Please state why you consider the example to be
representative of a common case in C. Since I have never seen such, I
suspect that if an example exists it is _very_ obscure.

|> [... aliasing of pointers to mixed type ...]
|>
|> We know you're a Fortran bigot, but that's ridiculous.

What's ridiculous about it? Even the C standard says it's illegal. Why
can't I complain about it? It is a _very_ common cause of errors.
Finally, what does this particular complaint have to do with Fortran? I
would have made the same comment if I had been comparing C to _any_
well-designed language (Pascal, Ada, Scheme, ...).

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/19/91)
> [...]
> The C standard doesn't say Thou Must Compile Separately. [...]

No. As far as I know, it doesn't use the word "Thou" at all. The
standard _does_ say:

	A C program need not all be translated at the same time. The
	text of a program is kept in units called _source_files_ in
	this standard, and previously translated files may be
	preserved individually or in libraries. The separate files of
	a program communicate by calls to functions whose identifiers
	have external linkage, and by manipulation of objects whose
	identifiers have external linkage. Translated files may be
	linked with previously translated libraries to produce an
	executable program.

Note: the standard specifically says that files may be compiled in any
order and that the only standard communication between files is through
external variables and functions.

J. Giles
henry@zoo.toronto.edu (Henry Spencer) (04/19/91)
In article <21812@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>|> [...] You do interprocedural
>|> analysis across all source files supplied to a single invocation of the
>|> compiler, [...]
>
>And, if you subsequently change _some_ of those source files? What
>then? Under this model, you'd have to recompile _all_ the others
>that were originally compiled together...

Correct. Arranging to do this is pretty trivial with something like
make. Of course, having to do it is a nuisance. Using some sort of
intermediate form as the basis for interprocedural analysis could
bypass *some* of the overhead, but having to reprocess all the relevant
code to some extent is really rather fundamental to interprocedural
optimization.

>... I'm just pointing out that it _does_ violate the
>standards of both Fortran and C which are defined to be separately
>compiled...

I can't speak for Fortran, but please cite chapter and verse out of
X3.159 when you say this about C. You are, simply, wrong. You are
confusing the characteristics of specific implementations with what is
mandated by the standard. There is *nothing* in ANSI C which demands
classical separate compilation or prevents interprocedural analysis.
-- 
And the bean-counter replied,    | Henry Spencer @ U of Toronto Zoology
"beans are more important".      | henry@zoo.toronto.edu utzoo!henry
henry@zoo.toronto.edu (Henry Spencer) (04/19/91)
In article <21818@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
> ... Translated files may be linked with previously
> translated libraries to produce an executable program.

Note that there is no definition of what "translation" means. It could
mean something as trivial as tokenizing the source and storing it for
later processing. It does not have to mean classical compilation.

>Note: the standard specifically says that files may be compiled
>in any order and that the only standard communication between files
>is through external variables and functions.

No, it says that they may be *translated* in any order, which does not
necessarily mean compilation. As for the communication, note the
discussion in 2.1.2.3, which basically says that the implementation can
do anything it pleases so long as the externally-visible results of
running the program remain much the same. Portable user code cannot
rely on other communication paths, but the implementation can use any
information it can get.
-- 
And the bean-counter replied,    | Henry Spencer @ U of Toronto Zoology
"beans are more important".      | henry@zoo.toronto.edu utzoo!henry
barmar@think.com (Barry Margolin) (04/19/91)
In article <21812@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >In article <1991Apr17.225944.15261@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes: >|> [...] You do interprocedural >|> analysis across all source files supplied to a single invocation of the >|> compiler, [...] > >And, if you subsequently change _some_ of those source files? What >then? Under this model, you'd have to recompile _all_ the others >that were originally compiled together. This violates one of the >principal reasons to have separate compilation to begin with - the >elimination of the need to recompile most of your code every time >you change a piece of it. When one is in a heavy edit-compile-test loop, one normally disables many of the optimizations that slow down the compile step. So, during this stage of development you probably wouldn't use the inter-module analysis option, and would only have to recompile the modules that you edited. When you get to the stage of compiling with all the optimization options, then you might have to recompile lots of files each time you change just one. Hopefully, by this time in the development such changes should be rare. > I'm just pointing out that it _does_ violate the >standards of both Fortran and C which are defined to be separately >compiled. There are a lot of _non-standard_ ways to improve >performance. But, it is always a mistake to allow someone to >claim that any of these are really standard after all. I interpreted the section of the standard that you quoted in another posting as *allowing* separate compilation. This means that the programmer isn't forced to compile everything together in order to achieve the semantics specified by the language. However, this doesn't prohibit an implementation from providing extensions that compile a group of source files together as a batch. -- Barry Margolin, Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
jlg@lanl.gov (Jim Giles) (04/19/91)
In article <1991Apr18.190403.29049@Think.COM>, barmar@think.com (Barry Margolin) writes: > [...] > I interpreted the section of the standard that you quoted in another > posting as *allowing* separate compilation. This means that the programmer > isn't forced to compile everything together in order to achieve the > semantics specified by the language. [...] The standard _allows_ the _USER_ to decide whether to compile separately or together. In the standard, the word "may" applies to user choices, the corresponding constraint on implementations is that it _must_ work correctly when the user chooses to exercise the feature. In this case, if the user chooses only to recompile one file and not any of the others (which is allowed by the "may" clause), the implementation _must_ allow this to be linked with the previously translated files to produce a program. Sorry, those are the rules. These ANSI standards describe capabilities of the language - the requisite properties of an implementation must be inferred from that. And, as I said, if the standard says you _may_ do something, the implementation _must_ permit you to do it. J. Giles
jlg@lanl.gov (Jim Giles) (04/19/91)
In article <1991Apr18.185442.7546@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes: > [...] > Note that there is no definition of what "translation" means. It could > mean something as trivial as tokenizing the source and storing it for > later processing. It does not have to mean classical compilation. Echoing my own previous arguments is not going to get you anywhere. I have previously stated that if you move the "backend" processes of compilation into a post-compilation tool (like the loader), you could do interprocedural analysis and still remain standard conforming. I also said that this is not what you were recommending, and that I had seen no production-quality compiler/loader environments fitted to do this. If you want now to recommend this implementation as opposed to the non-conforming method of dependent compilation you were previously recommending, then I have no quarrel with you. J. Giles
cs450a03@uc780.umd.edu (04/19/91)
Joe English writes: [about inter-procedural optimization] >What happens, then, if you compile foo.c and bar.c together, and >later change and recompile bar.c alone? foo.o (or its equivalent) >might have been optimized to use information that's no longer valid. >Does the Standard say that it's OK for an implementation to require >recompilation of foo.c at this point? It might be worth pointing out that if a compiler were designed with the option of doing inter-procedural analysis, and if that option were used, there would be no need to have generated a "foo.o" in the first place. Note that I'm not saying that there would be no intermediate files, just that they should not be confusable with intermediate files used for independent compilation. Also note that I'm not saying that the compiler couldn't generate "foo.o" -- maybe somebody would throw in a switch to gratuitously create independent object files even though compiling with inter-procedural optimization turned on. Finally, note that I know of no such compiler, and I'm not about to write one. Raul Rockwell
henry@zoo.toronto.edu (Henry Spencer) (04/19/91)
In article <21848@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >If you want now to recommend this implementation as opposed to the >non-conforming method of dependent compilation you were previously >recommending, then i have no quarrel with you. Either method is fully, 100% conforming. There is nothing non-conforming about a compiler that does certain optimizations only in favorable situations, e.g. when it is compiling all source modules involved. It has to be able to cope with separately-translated files, but that does not require that it translate each and every file independently. It is perfectly entitled to notice that it is being asked to process more than one file, and to do its optimizations on all of them together rather than each one separately. The only constraint is that certain aspects of the external behavior of the resulting program must not vary with such optimizations. -- And the bean-counter replied, | Henry Spencer @ U of Toronto Zoology "beans are more important". | henry@zoo.toronto.edu utzoo!henry
henry@zoo.toronto.edu (Henry Spencer) (04/19/91)
In article <21846@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >if the user chooses only to recompile one file and not any of the others >(which is allowed by the "may" clause), the implementation _must_ allow >this to be linked with the previously translated files to produce a >program. Of course, it may not run as fast as if he had compiled them all together. -- And the bean-counter replied, | Henry Spencer @ U of Toronto Zoology "beans are more important". | henry@zoo.toronto.edu utzoo!henry
jlg@cochiti.lanl.gov (Jim Giles) (04/19/91)
In article <1991Apr18.233807.19552@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes: |> In article <21846@lanl.gov> jlg@lanl.gov (Jim Giles) writes: |> >if the user chooses only to recompile one file and not any of the others |> >(which is allowed by the "may" clause), the implementation _must_ allow |> >this to be linked with the previously translated files to produce a |> >program. |> |> Of course, it may not run as fast as if he had compiled them all together. I see that I'm going to have to give a specific example. Suppose that functions A and B are in two separate files. Suppose that I compile them 'together' in some way (that is, the compiler makes some use of internal knowledge of B while translating A and vice-versa). Now, suppose I _change_ A and retranslate just A (I do _not_ retranslate B). The standard requires that I still be able to link both into a single program. If the translation of B made any _profitable_ use of internal information about A, then that information is obsolete (and probably downright _wrong_) after A has been changed. Yet, the standard requires that the new version of A still be usable with the previously translated B. Either the use the translator made of the interprocedural analysis was trivial, or B will contain unsafe optimizations - the only third way is if 'translation' does not include code generation (which will subsequently be done by some post-translation tool: like the loader - just like I keep saying). I don't understand why you continue to hang on like grim death to the fiction that dependent compilation satisfies the requirements of the standard. J. Giles
jeenglis@alcor.usc.edu (Joe English) (04/19/91)
henry@zoo.toronto.edu (Henry Spencer) writes: > There is nothing non-conforming >about a compiler that does certain optimizations only in favorable situations, >e.g. when it is compiling all source modules involved. It has to be able to >cope with separately-translated files, but that does not require that it >translate each and every file independently. It is perfectly entitled to >notice that it is being asked to process more than one file, and to do its >optimizations on all of them together rather than each one separately. What happens, then, if you compile foo.c and bar.c together, and later change and recompile bar.c alone? foo.o (or its equivalent) might have been optimized to use information that's no longer valid. Does the Standard say that it's OK for an implementation to require recompilation of foo.c at this point? (My *guess* is that, yes, an implementation is allowed to say "you must recompile foo.c" or even generate an incorrect program in a case like this; as far as I know the C Standard has very little to say about the programming environment as far as revision control is concerned. Ada has rules to handle this sort of situation, but C doesn't seem to. Then again, I don't own, and have never even read, the ANSI C Standard so this is all conjecture.) >The >only constraint is that certain aspects of the external behavior of the >resulting program must not vary with such optimizations. If that's the case, and given that 'make' and similar utilities don't keep track of the compilation history, it would be extremely difficult to do any kind of cross-file interprocedural optimizations in a conforming C implementation for a conventional (e.g., Unixlike) environment. I have to agree (as painful as it is to do so) with Jim Giles on this point. --Joe English jeenglis@alcor.usc.edu Apologies in advance to Doug Gwyn et al. for stating an opinion about the Standard without having read it.
bhoughto@bishop.intel.com (Blair P. Houghton) (04/19/91)
In article <21812@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >In article <1991Apr17.225944.15261@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes: >|> [...] You do interprocedural analysis across all source files >|>supplied to a single invocation of the compiler, [...] > >And, if you subsequently change _some_ of those source files? What >then? Under this model, you'd have to recompile _all_ the others >that were originally compiled together. This violates one of the >principal reasons to have separate compilation to begin with - the >elimination of the need to recompile most of your code every time >you change a piece of it. ANSI X3.159-1989, sec. 2.1.2.3, p. 8, l. 30: "...issues of optimization are irrelevant." The only reason you need to recompile (in this situation, from the information given) is to provide the compiler enough information to optimize a certain subset of the syntax. If you want optimizations of this sort, you must: (a) use a compiler which provides this optimization; (b) write a makefile that indicates the dependencies inherent in this optimization; and, (c) run make(1) instead of cc(1). Neither are any of these things specified by the standard, nor are any of them prohibited to be used in concert with a conforming implementation of the C language. Basically, if you want to do something outside the realm of all sensibility, you may have to hold its hand. --Blair "And you'll _still_ spend less time each day recompiling source files than you do shaking your penis dry at the urinal."
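[Blair's steps (a)-(c) might look something like the sketch below. The `-xlink` flag is invented for illustration; no particular compiler is implied. The point is that once the objects are interdependent, the dependency line must name *all* the sources, so make rebuilds the whole group when any one of them changes.]

```make
# Sketch only: "-xlink" stands in for whatever option the
# hypothetical compiler uses for cross-file optimization.
# Because the generated code for foo.c may depend on the
# internals of bar.c (and vice versa), the program depends
# on every source in the group, not just the one that changed.
SRCS = foo.c bar.c

prog: $(SRCS)
	cc -xlink -o prog $(SRCS)
```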
steve@groucho.ucar.edu (Steve Emmerson) (04/19/91)
In <21868@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >I see that I'm going to have to give a specific example. Suppose >that functions A and B are in two separate files. Suppose that >I compile them 'together' in some way (that is, the compiler makes >some use of internal knowledge of B while translating A and vice-versa). >Now, suppose I _change_ A and retranslate just A (I do _not_ retranslate >B). The standard requires that I still be able to link both into a >single program. ... I don't believe the C Standard requires that a compiler capable of the above generate the separate "object" files A.o and B.o when they are somehow "compiled together". My understanding is that it could instead generate the "object" file AB.o. Thus, your problem can't occur: you can't link A.o against a re-compiled B.o *unless* you generated A.o separately as well. Steve Emmerson steve@unidata.ucar.edu ...!ncar!unidata!steve
bhoughto@bishop.intel.com (Blair P. Houghton) (04/19/91)
In article <21818@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >Note: the standard specifically says that files may be compiled >in any order and that the only standard communication between files >is through external variables and functions. They sure may, but why is it so horrible to you to document to the users of your optimized libraries that certain arguments to certain functions must be carefully non-aliased? That kind of restriction stopped nobody from specifying a standard function that defines and returns a static object (thus making more than one call to the function a risky construct, at best...) --Blair "How does Yogi eat all that honey right out of the tree without getting stung?"
bhoughto@bishop.intel.com (Blair P. Houghton) (04/19/91)
In article <21868@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >I don't understand why you continue to hang on like grim death >to the fiction that dependent compilation satisfies the requirements >of the standard. I haven't seen anyone state such a thing. What I have seen is plenty of statements that equate to the standard's requirement that a conforming program be translatable by a conforming implementation into a form that can be used to produce the output associated with a given input. It's just tough luck if this prevents you from expecting a certain optimization to be applied when you compile in strictly conforming mode. This has absolutely no effect on the form of the code in the source files. Exactly the opposite. It provides for the desired state where a program can be translated on any number of machines without alteration of the source files. If, however, it's provided you with extensions that allow you to produce odd effects, and you use them, then you are exceeding the bounds of the standard's influence, and whatever is required by the extension you must obey or it just won't work, apparently. This still has no effect on the source code. You haven't stated that special syntax must be provided in the source files in order to use this optimization you so desire, so there's no reason that my strictly conforming compiler that doesn't optimize in this way and your specialized compiler that does optimize in this way shouldn't be able to produce the same output. You're so hyped up about the fact that separate compilation prevents use of this optimization, but you haven't been so vocal about the case where you deliberately alias one of your arrays, which also blows your optimization. If you want to go out of your way, you have to go out of your way. 
--Blair "If the implementation also has knobby tires, a half-fairing, and a bottle of diethyl ether, then it's also an offroad crotch-rocket, but it better do the -ansi thing for `main(){printf("Hello, world.\n");}'"
barmar@think.com (Barry Margolin) (04/19/91)
In article <21846@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >The standard _allows_ the _USER_ to decide whether to compile separately >or together. In the standard, the word "may" applies to user choices, >the corresponding constraint on implementations is that it _must_ work >correctly when the user chooses to exercise the feature. In this case, >if the user chooses only to recompile one file and not any of the others >(which is allowed by the "may" clause), the implementation _must_ allow >this to be linked with the previously translated files to produce a >program. Once you invoke the implementation's extension that does inter-module optimization (why does everyone keep calling this interprocedural analysis? that usually refers to optimization across procedures within the same file), you have ventured outside the scope of the standard, just as if you had invoked a compiler option that enables some implementation-specific feature. Rather than referring specifically to files, the standard probably should have described the behavior in terms of "translation units", which could encompass multiple source files. This would have allowed multifile compilation to be defined in the standard, rather than making it an extension, and then the semantics would be covered by the section you quoted. -- Barry Margolin, Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/19/91)
Anyone who isn't sure about the issues involved here should keep the following in mind while reading Jim's articles: A standard-conforming compiler can include foo.c as part of foo.o. When it links foo.o and bar.o together to make a program, it still has foo.c and bar.c right there inside the object files, so it can do all types of interprocedural analysis. In practice, compilers don't include *all* of foo.c and bar.c---only some useful information about the procedure calls and variable use. This still conforms with the standard. Jim, I recommend that you look at the MIPS C compilers. ---Dan
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/19/91)
In article <21815@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: > You can tell me that until you are blue in the face. I have never seen > any of these hypothetical contexts in which anonymous pointer arguments > passed in as parameters can be safely assumed not to be aliased. > Please give an example. (``Anonymous pointer arguments''?)

	sort(table,records,n)
	int *table;
	struct blah *records;
	int n;

This routine sorts n records, setting table[i] to the position of the i'th smallest record. This is, in fact, a very common type of sort, hardly a ``hypothetical'' or contrived example. I recommend that you read Knuth to learn about sorting and its many applications. Since neither ``int'' nor ``struct blah'' is ``char'' or ``void'', a standard compiler may assume that the pointers are not aliased. (I don't expect Jim to respond to this article; he always ignores counterexamples.) ---Dan
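[Dan's fragment can be fleshed out into a compilable sketch. The struct layout and the qsort-based body below are illustrative inventions, not his code; only the parameter types matter to the aliasing argument: `int *` and `struct blah *` point at different object types, so a conforming compiler may assume the two arrays do not overlap.]

```c
#include <stdlib.h>

struct blah { int key; };

/* File-scope pointer so the qsort comparator can see the records. */
static struct blah *recs;

static int cmp(const void *a, const void *b)
{
    int i = *(const int *)a, j = *(const int *)b;
    return recs[i].key - recs[j].key;
}

/* table[i] receives the index of the i'th smallest record;
 * the records themselves are not moved. */
void sort(int *table, struct blah *records, int n)
{
    int i;
    for (i = 0; i < n; i++)
        table[i] = i;                 /* identity permutation */
    recs = records;
    qsort(table, n, sizeof *table, cmp);
}
```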
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/19/91)
In article <21703@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: > The only information available to the compiler about > procedures that are separately compiled are the names of routines > that the current procedure calls and the number and type of their > arguments and results (if any). This is incorrect. The standard makes no special allowance for ``the names of routines that the current procedure calls and [etc.].'' Each C source file has a *meaning* dependent solely upon the information in that file and its inclusions. Nevertheless, optimizations do not change that meaning and may depend upon other information. > For a language which requires > separate compilation, _NO_OTHER_INFORMATION_ is known about an > external procedure. This is incorrect. Again, the meaning of a source file depends solely upon that source file, prototypes in #includes, etc., but lots of other information may be known to the compiler. > The reason is that _all_ other characteristics > of an external procedure are allowed to _change_ without the need to > recompile any uses of that procedure. This is incorrect. Compilation consists of translation (like foo.c becoming foo.o) and loading (like foo.o and bar.o becoming a.out). The *translation* of one file is independent of other files, but the *loading* depends on all the files together. > The bottom line is that separate compilation implies that procedures > may be used together for which the _source_ of the separate routines > has never even existed in the same programming environment and only > the object code has been distributed. This is incorrect. The ANSI C standard requires separate compilation. It does not define ``the same programming environment'' or ``distributed.'' (As I pointed out in my previous article, a conforming compiler may produce object code that includes the entire source code, so Jim's statement is obviously wrong.) 
> Further, even the object code > of each may not be available at the time each of the other procedures > is compiled. This is incorrect. Compilation includes loading, and loading requires all object code for the program. It's true that the other objects may not be available at *translation* time, but optimizations aren't restricted to translation time. > Hence, under separate compilation there may be no internal > information about other procedures used when a given procedure is compiled > because such information may not exist, may be speculative, or may > subsequently change without notice. This is incorrect. The correct information may be carried along with each object file. ---Dan
jlg@cochiti.lanl.gov (Jim Giles) (04/19/91)
> I don't believe the C Standard requires that a compiler capable of the > above generate the separate "object" files A.o and B.o when they are > somehow "compiled together". My understanding is that it could instead > generate the "object" file AB.o. Thus, your problem can't occur: you > can't link A.o against a re-compiled B.o *unless* you generated A.o > separately as well. Well, my example was the other way around (I postulated retranslating A and not B), but I'll use your version just as well. The standard says that I _may_ link a fresh translation of B with previously translated files (like A, for example). It does _not_ say that I can only do this if A was not previously translated with an older version of B, or anything to that effect. (Note again: anything the standard says I _may_ do, the implementation _must_ permit me to do.) To reiterate, the standard says I may link translated files to previously translated ones to form programs. It places no constraints on this. As long as each translated file is itself standard conforming, linking them together is a capability required by the standard. Now, it _does_ base linkage of translated units on the definition of the units as "source files". So, I would read that to mean that interprocedural analysis and optimization _is_ allowed between functions defined in the _same_ source file. This is contrary to the standard practice of putting function definitions into separate files as much as is possible. Further, I've not seen any compilers which take advantage of this method of optimization. In fact, Pascal did not even allow separate compilation, and I never saw a Pascal compiler which did interprocedural analysis either - compiler technology just hasn't got to the point where such analysis is the commonplace of production compilers. When it does become commonplace, care must be exercised in applying it to C while still claiming standard conformance. J. Giles
henry@zoo.toronto.edu (Henry Spencer) (04/19/91)
In article <16692@chaph.usc.edu> jeenglis@alcor.usc.edu (Joe English) writes: >> There is nothing non-conforming >>about a compiler that does certain optimizations only in favorable situations, >>e.g. when it is compiling all source modules involved... > >What happens, then, if you compile foo.c and bar.c together, >and later change and recompile bar.c alone? foo.o (or its equivalent) >might have been optimized to use information that's no longer valid. Why do you assume that foo.o even exists? The standard requires that it exist at some point during the compilation of foo.c+bar.c, but it could be only a compiler-internal temporary, with foobar.o being generated as the result of the compilation. Actually, there might not even be a .o file, just the final a.out. (Remember that the standard allows things like C interpreters, so it is very carefully worded to place very few constraints on how the results of translation show up.) >Does the Standard say that it's OK for an implementation to >require recompilation of foo.c at this point? I would tentatively interpret that as being legitimate only if foo.o was not actually made available to the user. Otherwise you would really have to bend the rules to dodge around the "you can compile things separately and link them later" part. It's not actually breaking the rules, because all they say is that there must be *a way* to do it, and maybe the way is to compile files one at a time, but it's not going to endear the compiler writer to his customers. >>The >>only constraint is that certain aspects of the external behavior of the >>resulting program must not vary with such optimizations. > >If that's the case, and given that 'make' and similar utilities >don't keep track of the compilation history, it would be extremely >difficult to do any kind of cross-file interprocedural optimizations >in a conforming C implementation for a conventional (e.g., >Unixlike) environment... Why so? 
When I say "external behavior", I mean of the generated code, not of the compiler. There are any number of ways you can contrive to make things work, the simplest being the foobar.o approach that denies any opportunity to make them not work. You'd have to modify the Makefiles to feed such a compiler properly. It's often necessary to modify Makefiles for the sake of odd compilers. ANSI C does not place any constraints on that. :-) -- And the bean-counter replied, | Henry Spencer @ U of Toronto Zoology "beans are more important". | henry@zoo.toronto.edu utzoo!henry
jlg@cochiti.lanl.gov (Jim Giles) (04/19/91)
> [...] (why does everyone keep calling this interprocedural analysis? > that usually refers to optimization across procedures within the same > file) [...] Oh. Quite so. Only the common practice of C programmers putting each function in its own separate file makes the mistake doubly easy to fall into. Not to mention the obvious fact that the implementation details of interprocedural and intermodule analysis would be almost identical from the implementor's point of view. > Once you invoke the implementation's extension that does inter-module > optimization [...] > [...] you have ventured outside the scope of the standard, just as if you > had invoked a compiler option that enables some implementation-specific > feature. That's what I keep trying to point out. Thank heavens someone has finally joined the discussion who understands the issue. Further, I agree that the standard could have been just subtly reworded to eliminate the objection just stated. But the fact remains that it was worded the way it was (and probably for a reason - vendors would like to continue to be able to distribute only object code, and if large pieces of the object code are interdependent, this would make incremental fixes and distributions much more difficult). J. Giles
dbrooks@osf.org (David Brooks) (04/20/91)
In article <21868@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes: |> I see that I'm going to have to give a specific example. Suppose |> that functions A and B are in two separate files. Suppose that |> I compile them 'together' in some way (that is, the compiler makes |> some use of internal knowledge of B while translating A and vice-versa). |> Now, suppose I _change_ A and retranslate just A (I do _not_ retranslate |> B). The standard requires that I still be able to link both into a |> single program. This has never been the case. Using pre-ANSI syntax:

B:
	some_routine(123);

A:
	some_routine(arg)
	int arg;
	{...

A after editing:
	some_routine(flag, arg)
	int flag, arg;
	{...

-- David Brooks dbrooks@osf.org Systems Engineering, OSF uunet!osf.org!dbrooks Donnie Wahlberg is brought to you by three billion years of evolution.
steve@groucho.ucar.edu (Steve Emmerson) (04/20/91)
In <21961@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >To reiterate, the standard says I may link translated files to previously >translated ones to form programs. It places no constraints on this. As >long as each translated file is itself standard conforming, linking them >together is a capability required by the standard. Hmm... it appears you didn't understand what I was trying to say, so I must not have stated it clearly enough. Let me try again. In my opinion the following two scenarios do not violate the standard and yet allow for global optimization *and* incremental compilation (though not necessarily at the same time).

Scenario #1 (demonstration of global optimization):

	$ cc -O3 -o foobar.o -c foo.c bar.c
	$ cc foobar.o

Scenario #2 (demonstration of the *required* functionality of incremental compilation):

	$ cc -O -c foo.c
	$ cc -O -c bar.c
	$ cc foo.o bar.o

Since, in the above, one can't do both global optimization and incremental compilation at the same time, I suppose one could argue that the above implementation doesn't conform to the standard (at least when performing global optimization). This might be true, but I consider it unimportant because, to me at least, the important issues are 1) whether or not the *language* disallows global optimization (it does not); and 2) whether or not the standard *forbids* behavior or capabilities outside its scope (it cannot). I'll second Dan's suggestion about the MIPS C compiler. Steve Emmerson steve@unidata.ucar.edu ...!ncar!unidata!steve
jlg@cochiti.lanl.gov (Jim Giles) (04/20/91)
It has been suggested that a program of the following type constitutes a counterexample to my previous statements about pointers and aliasing:

	func(i,j)
	char *i;
	float *j;
	...

However, my original statement about pointers and aliasing was that pointer arguments of the _same_ underlying type must be assumed to be aliased to each other and to all globals of that type (as well as any local object of that type whose address has possibly been assigned to one of these visible pointers, etc.). I admit that I also (and distinctly separately) mentioned that a prudent C compiler would also assume aliasing in many other cases because of the common practice that C programmers seem to covet of casting pointers in illegal ways. It is only in that sense that the above type of program is even relevant to anything I have been saying. In fact, I do believe that a prudent C implementation should assume that i and j in the above are aliased. But, since such aliasing is illegal, it does not trouble me that a standard conforming C compiler is free to assume that they are _not_ aliased. Either way, it does not affect my main point. J. Giles
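[Jim's same-type case can be shown in a few lines. This example is mine, not from the thread: with two int* parameters, the compiler must assume they may refer to the same object, so it cannot assume *a still holds 2 after the store through b. It must reload *a rather than reuse the constant.]

```c
/* Returns 3 when a and b alias, 2 when they point at distinct
 * objects.  Because both parameters have the same underlying type,
 * a conforming compiler cannot fold the final load of *a into the
 * constant 2: the store through b might have changed it. */
int through_two(int *a, int *b)
{
    *a = 2;
    *b = 3;
    return *a;
}
```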
jlg@cochiti.lanl.gov (Jim Giles) (04/20/91)
> You're so hyped up about the fact that separate compilation > prevents use of this optimization, but you haven't been > so vocal about the case where you deliberately alias one > of your arrays, which also blows your optimization. Since I _never_ deliberately alias array arguments either to each other or to globals to which the procedure I'm calling has access, this is not an issue for me. In _most_ code (not just mine), _most_ of the arrays that you pass around as parameters are not aliased, and _most_ C compilers forgo a perfectly valid optimization in order to cater to the much rarer case where someone has "deliberately" aliased something. Deliberate aliasing happens all the time in processing recursive data structures (like linked lists, trees, graphs, etc.), but for array manipulation it is almost always a serious error - no matter how carefully the compiler works. It would be nice if C let me pass arrays to procedures _without_ converting them to pointers, and it would be nice if the compiler were free to assume that array parameters were not aliased to each other or to any globals. If they become aliased to any locals, that could only happen through locally visible action that the compiler can see without interprocedural analysis. J. Giles
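[A hypothetical example, not from the thread, of why aliased array arguments are "almost always a serious error": copying an array up by one slot works with distinct source and destination, but called with dst == src it smears src[0] across the whole range.]

```c
/* Copies src[0..n-1] into dst[1..n].  dst must have room for n+1
 * elements.  If dst and src are the same array, each iteration reads
 * the element the previous iteration just overwrote, so the original
 * contents are destroyed rather than shifted. */
void shift_up(int *dst, const int *src, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i + 1] = src[i];
}
```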
jlg@cochiti.lanl.gov (Jim Giles) (04/20/91)
> [...] but why is it so horrible to you to document
> to the users of your optimized libraries that certain
> arguments to certain functions must be carefully
> non-aliased? [...]

It's not horrible.  You miss my point if you believe that I would
oppose that.  You give me a portable way to declare to the compiler
that two (or more) arguments are assumed not to be aliased and I will
rejoice.

The problem is that what people are recommending is _not_ portable and
is _not_ an explicit declaration (so the end user has to guess which
variables the compiler optimized in this way and which are safe to
alias).

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/20/91)
> This has never been the case.  Using pre-ANSI syntax:
>
> B:
>     some_routine(123);
>
> A:
>     some_routine(arg) {...
>
> A after editing:
>     some_routine(flag, arg)

Irrelevant to this discussion, since it was always clearly understood
(and clearly stated at the beginning of the discussion) that the valid
means of communication between separately translated units were
external functions and external variables.  If you introduce a change
which makes these incompatible, you have stepped outside the domain of
this discussion.

The main purpose of separate compilation is two-fold: to save
compilation time when only part of the code has been modified, and to
allow a vendor to alter his libraries for revisions without having to
distribute the _whole_ library again or distribute source.
Presumably, incompatible revisions require more effort than the more
common compatible ones.

J. Giles
mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) (04/20/91)
bhoughto@bishop.intel.com (Blair P. Houghton) writes:

>In article <21868@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>>I don't understand why you continue to hang on like grim death
>>to the fiction that dependent compilation satisfies the requirements
>>of the standard.

>I haven't seen anyone state such a thing.  What I have
>seen is plenty of statements that equate to the standard's
>requirement that a conforming program be translatable by
>a conforming implementation into a form that can be used
>to produce the output associated with a given input.  It's
>just tough luck if this prevents you from expecting
>a certain optimization to be applied when you compile
>in strictly conforming mode.

And, in fact, it doesn't - if your system wants to do 'serious'
interprocedural analysis while still allowing separate, order
independent compilation, there's nothing to stop it from adding an
extra table to the object file describing the properties of that
object file, and using a clever linker (post-compiler?) to sort all
that information out.

Any specification that non-static names are the only 'standard'
inter-module information available applies only to C source, not
object modules - after all, who ever expected object code to be
portable?  All we know is that they must support *at least* the
information required by C.  There's no reason why .o files can't
consist of pure RTL.  Or C source code, for that matter.  Although
that *would* require a better linker :).

Not that I've read the ANSI spec - I'm too poor.

Cheers,
    Mike.

ps: the MIPS compilers (which are pretty good :) provide this option -
you can produce ucode files instead of .o files (independently); the
ucode linker provides global optimisation.
--
Mike McGaughey		AARNET: mmcg@bruce.cs.monash.oz.au

"His state is kingly; thousands at his bidding speed,
 And post o'er land and ocean without rest."	- Milton.
rjc@cstr.ed.ac.uk (Richard Caley) (04/20/91)
In article <21868@lanl.gov>, Jim Giles (jg) writes: jg> Suppose jg> that functions A and B are in two separate files. Suppose that jg> I compile them 'together' in some way (that is, the compiler makes jg> some use of internal knowledge of B while translating A and vice-versa). jg> Now, suppose I _change_ A and retranslate just A (I do _not_ retranslate jg> B). The standard requires that I still be able to link both into a jg> single program. If the translation of B made any _profitable_ use jg> of internal information about A, then that information is obsolete jg> (and probably downright _wrong_) after A has been changed. Yet, the jg> standard requires that the new version of A still be usable with the jg> previously translated B. Either the use the translator made of the jg> interprocedural analysis was trivial, or B will contain unsafe jg> optimizations - the only third way is if 'translation' does not jg> include code generation (which will subsequently be done by some jg> post-translation tool: like the loader - just like I keep saying). Fourth option: When compiling `together' the compiler creates object files containing two complete sets of code, data etc. The linker looks at the object files given and if they were not all compiled together (eg if you had recompiled one) it uses the non-optimised code. I'm sure the net.experts can come up with much better solutions, this is simply the one which was most obvious to me. -- rjc@cstr.ed.ac.uk _O_ |<
mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) (04/20/91)
jlg@cochiti.lanl.gov (Jim Giles) writes:

>> Once you invoke the implementation's extension that does inter-module
>> optimization [...]
>> [...] you have ventured outside the scope of the standard, just as if you
>> had invoked a compiler option that enables some implementation-specific
>> feature.

>That's what I keep trying to point out.  Thank heavens for someone
>finally joining the discussion who understands the issue.

A pity you don't.  All that the above means is that ANSI C doesn't
force you to optimise code.  As plenty of people have pointed out, it
doesn't force you *not* to optimise code either, nor does it limit the
extent of your optimisations.  You just optimise *after* the
C-to-intermediate-code translation is done.

	Mike.
--
Mike McGaughey		AARNET: mmcg@bruce.cs.monash.oz.au

"His state is kingly; thousands at his bidding speed,
 And post o'er land and ocean without rest."	- Milton.
aipdc@castle.ed.ac.uk (Paul Crowley) (04/21/91)
In article <21815@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>I would have made the same comment if I had been comparing C to _any_
>well designed language (Pascal, Ada, Scheme, ...).

C is bad.  Therefore Pascal is good.  Pass the sick bag.

Actually, why is it that reading comp.lang.c is the one thing sure to
put me off C and start me looking for a language that I like?  I keep
telling people it's possible to program well, and even to prove your
programs, in C, but the more I read this group the less sure I get.
Aren't there any good languages out there?

 ____
 \/ o\   Paul Crowley  aipdc@castle.ed.ac.uk
 /\__/   Part straight. Part gay. All queer.
 \/
gwyn@smoke.brl.mil (Doug Gwyn) (04/21/91)
In article <1991Apr18.233653.19435@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>There is nothing non-conforming about a compiler that does certain
>optimizations only in favorable situations, ...

Note that WITHIN a translation unit various "interprocedural"
optimizations may be performed to good effect, including
identification of alias-free pointer arguments.  One of the C
compilers I routinely use goes so far as to expand small functions
"in-line" in order to avoid function linkage overhead.
gwyn@smoke.brl.mil (Doug Gwyn) (04/21/91)
In article <3906@inews.intel.com> bhoughto@bishop.intel.com (Blair P. Houghton) writes:
>You're so hyped up about the fact that separate compilation
>prevents use of this optimization, but you haven't been
>so vocal about the case where you deliberately alias one
>of your arrays, which also blows your optimization.

Which is one reason why this is not a C-specific issue.  The fact that
the Fortran standard prohibits a program from calling subroutines with
arguments that are overlapping array sections, thereby enabling a
certain class of optimizations (loop vectorization), means that a
class of useful program constructs is also outlawed.  Any reasonable
language that supports pointer arguments needs to deal with this issue
one way or another; Fortran came down on the side of restricting the
programmer, while C came down on the side of empowering the
programmer.

Due to these stupid "Fortran vs. C" arguments, C implementors have
looked very hard at ways to perform Fortran-like optimizations for C
pointer parameter usage, and have devised a number of techniques for
enabling the optimization in many circumstances.  Cray's Standard C
Compiler release 2.0 is an example of a C compiler that performs
extensive optimization, including loop vectorization.
gwyn@smoke.brl.mil (Doug Gwyn) (04/21/91)
In article <21964@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>... the common practice of C programmers putting each function in its own
>separate file makes the mistake double easy to fall into. ...

This is the second "common C programming practice" you have raised in
this thread that is in fact NOT a common practice among skilled C
programmers.  If your argument is that C requires a certain degree of
expertise to use WELL, then you're wasting our time, since that has
never been in question.
bhoughto@hopi.intel.com (Blair P. Houghton) (04/22/91)
In article <22021@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
[...and forgets to attribute me...]
>> prevents use of this optimization, but you haven't been
>> so vocal about the case where you deliberately alias one
>> of your arrays, which also blows your optimization.
>
>Since I _never_ deliberately alias array arguments either to each
>other or to globals to which the procedure I'm calling has access,
>this is not an issue to me.

And I don't depend on Fortrannical optimizations that can break
whenever the vendor feels like it.  If I want things linked that
closely, I'll keep them close.

[...lots of "it would be nice if C were Fortran" deleted...]

Some of us like the fact that arrays are passed as pointers, and can
be aliased rampantly.  It's a matter of looking _both_ ways before
stepping into the street, though the traffic is _supposed_ to be
traveling in only one direction per lane...

				--Blair
				  "Go figure."
bhoughto@hopi.intel.com (Blair P. Houghton) (04/22/91)
In article <22025@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
[...and forgets to attribute me _again_...]
>> [...] but why is it so horrible to you to document
>> to the users of your optimized libraries that certain
>> arguments to certain functions must be carefully
>> non-aliased? [...]
>
>It's not horrible.  You miss my point if you believe that I would
>oppose that.

It's the best solution there is, and you still want to crock a
working language.

>You give me a portable way to declare to the compiler
>that two (or more) arguments are assumed not to be aliased and I will
>rejoice.

Rejoice away:

    double *array_mollify(double *bar)
    /* Mr. Compiler, I declare that no aliasing is being done
       and I've put it in the man-page so you can optimize
       until the whole function is but an opcode */
    {
        ...
    }

It's as good as any, and you can forget portable.  Your optimization
is dependent on the cpu, the OS, the compiler, the compiler flags, and
the main-memory organization.  At the least.  With all that against
it, how bad would it be if it's also dependent on the source language?
How good would it be to understand finally that we've been telling the
truth for three days when we say it is most definitely NOT dependent
on the source language?

>The problem is that what people are recommending is _not_ portable
>and is _not_ an explicit declaration (so the end user has to guess
>which variables the compiler optimized in this way and which are
>safe to alias).

Are you doing this deliberately?  Nobody has to guess a thing.  You
tell them.  "Argument 1 is safe; when argument 2 is unaliased and you
use the -bletch flag, the compiler is free to optimize the arrays in
the Fooby way; if you specify -bletch but alias argument 2, grave
danger will befall the system."

On the other hand, it's portable as long as you don't expect it.  You
can't expect it.  Optimization is irrelevant.  Portability is given.
If I "translate" a conforming program into sign language and get Koko the gorilla to perform the correct operations on the data, it's still ANSI. --Blair "Koko knows the way to the Ice cream store, and cried when she was told her kitten had died, but I bet she doesn't do pointer-abalias-dependent optimizations..."
jlg@cochiti.lanl.gov (Jim Giles) (04/22/91)
> And, in fact, it doesn't - if your system wants to do 'serious'
> interprocedural analysis while still allowing separate, order
> independent compilation, there's nothing to stop it from adding an
> extra table to the object file describing the properties of that object
> file, and using a clever linker (post-compiler?) to sort all that
> information out.

Sounds exactly like what _I_ keep saying: the functionality _can_ be
performed in a standard conforming way by the loader or some other
load-time tool.

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/22/91)
> Scenario #1 (demonstration of global-optimization):
>
>     $ cc -O3 -o foobar.o -c foo.c bar.c
>     $ cc foobar.o
>
> Scenario #2 (demonstration of the *required* functionality of incremental
> compilation):
>
>     $ cc -O -c foo.c
>     $ cc -O -c bar.c
>     $ cc foo.o bar.o

The point you seem to miss is that after scenario 1, the vendor will
ship foobar.o to the customer.  Later, the vendor may change _just_
bar.c.  He will recompile it and send bar.o to the customer.  The
customer will then link bar.o and foobar.o to yield a program with the
new version of bar in it.  The standard says the vendor _may_ do this.
Which means that the implementation _must_ allow it to be done.  Since
the end user never has foo.c (and doesn't even know whether foo was
really written in C or not), scenario 2 is not relevant to the
discussion.

J. Giles

P.S.  As I keep pointing out, the _loader_ (or some load-time tool)
_can_ satisfy the standard and still do intermodule optimization.
The 'translator' (usually thought of as the compiler) cannot.
jfw@ksr.com (John F. Woods) (04/22/91)
jlg@lanl.gov (Jim Giles) writes:
>In article <1991Apr18.190403.29049@Think.COM>, barmar@think.com (Barry Margolin) writes:
>> I interpreted the section of the standard that you quoted in another
>> posting as *allowing* separate compilation.  This means that the programmer
>> isn't forced to compile everything together in order to achieve the
>> semantics specified by the language. [...]
>The standard _allows_ the _USER_ to decide whether to compile separately
>or together.

The standard allows the user to have "translation" done in separate
pieces.  For example, my compilation system generates object modules
by translating all comments to Swahili and stores the resulting C code
using LZW compression.  The job of the linker is to turn the object
modules, plus requested libraries, into an executable (I usually skip
the phase where it logs the comments (in Sanskrit) to the system
console, in order to save paper).  The only thing that the standard
ALLOWS users to depend on is external linkage.  As long as this
compilation system obeys the famous "AS IF" rule, it is standard
conformant, end of discussion.

> Sorry, those are the rules.

Try understanding the rules before yapping about them.  For example,

> And, as I said, if the standard says you _may_
> do something, the implementation _must_ permit you to do it.

If a compilation system waits until it knows where all the source code
is to perform code generation, what is the user not being permitted to
do?  Get slow code?  No thanks!
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/23/91)
In article <22004@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
> It has been suggested that a program of the following type constitutes
> a counterexample to my previous statements about pointers and aliasing:
>    func(i,j)
>    char *i;
>    float *j;

Well, no, that's not what I said, and you should learn the difference
between char/void pointers and non-char/void pointers.

> However, my original statement about pointers and aliasing was that
> pointer arguments to the _same_ underlying type must be assumed to be
> aliased to each other and to all globals of that type

Jim, I despise the way that you are trying to escape your mistake by
failing to quote what you claim to be responding to.  You were arguing
with Doug Gwyn; you said ``I have never seen any of these hypothetical
contexts in which anonymous pointer arguments passed in as parameters
can be safely assumed not to be aliased.''  I showed you how a very
common type of sorting routine provided exactly such a context.

Now are you saying you've never seen such sorting routines?  Or do you
admit that you simply haven't thought through the issues here?

---Dan
jlg@cochiti.lanl.gov (Jim Giles) (04/23/91)
> [...] Any reasonable language that
> supports pointer arguments needs to deal with this issue one
> way or another; Fortran came down on the side of restricting
> the programmer, while C came down on the side of empowering
> the programmer. [...]

No.  Fortran restricted the programmer in one way, C restricted the
programmer in a different way.  In Fortran, if I want to do things in
an overlapping way to a single array, I must pass that array around by
itself and make the overlap explicit everywhere it occurs.  In return
for that constraint, the compiler optimizes my code _much_ better than
it could if it had to assume all array arguments were aliased to each
other.  In C, the language gives me no way whatsoever to tell the
compiler that pointer arguments are not aliased - even though they
_usually_ aren't.  In return, I get the dubious advantage of hiding
overlap from procedures.

For a language to _empower_ the user, it must provide a portable and
explicit way of telling the compiler whether two arguments are allowed
to be aliased.  Neither language does this.  In fact, no language in
popular use does.

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/23/91)
In article <RJC.91Apr20023816@brodie.cstr.ed.ac.uk>, rjc@cstr.ed.ac.uk (Richard Caley) writes:
|> In article <21868@lanl.gov>, Jim Giles (jg) writes:
|>
|> jg> [...] the only third way is if 'translation' does not
|> jg> include code generation (which will subsequently be done by some
|> jg> post-translation tool: like the loader - just like I keep saying).
|>
|> Fourth option:
|>
|> When compiling `together' the compiler creates object files
|> containing two complete sets of code, data etc.  The linker looks at
|> the object files given and if they were not all compiled together (eg
|> if you had recompiled one) it uses the non-optimised code.

Same as my third option.  The load-time tool actually does the
interprocedural analysis.  Your solution merely makes the loader's
'code generation' duties simpler at the expense of more expensive
compilation and larger object files.  And, again, no implementation
does this (yet).

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/23/91)
>> And, as I said, if the standard says you _may_ >> do something, the implementation _must_ permit you to do it. > > If a compilation system waits until it knows where all the source code is > to perform code generation, what is the user not being permitted to do? > Get slow code? No thanks! If the compilation system waits until it knows where all the source code is to perform code generation, it's got a long wait coming. Many vendors don't distribute source code. Your compilation system is _never_ going to know where the vendor's source code is. What you are not permitting the user to do is one of the very things that separate compilation is _for_ - vendor distribution of _object_ code. If your compiler/language requires source to get fast code, you should think about changing languages. J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/24/91)
bhoughto@hopi.intel.com (Blair P. Houghton) writes: > [...] How good would it be > to understand finally that we've been telling the truth > for three days when we say it is most definitely NOT > dependent on the source language? Please hurry up and contact the X3J11 committee with this news! They have a subcommittee (the Numerical C Extensions Group - NCEG) who are wasting all sorts of time trying to correct this aliasing problem that C has. Since you _know_ it's not a problem, they are obviously all idiots. You should hurry to set them straight. Whoops! Forgot to mention: NCEG is also affiliated with WG14 (the ISO C working group). So you better set those guys straight too. (It couldn't be that these people know something that Blair Houghton doesn't? Nah - no chance.) J. Giles
rjc@cstr.ed.ac.uk (Richard Caley) (04/24/91)
In article <22245@lanl.gov>, Jim Giles (jg) writes:
jg> Sounds exactly like what _I_ keep saying: the functionality _can_
jg> be performed in a standard conforming way by the loader or some
jg> other load-time tool.
There have been at least two postings giving methods by which a
conforming implementation can do inter-file optimisations with the
linker doing nothing but link, while allowing the user to later change
and recompile one file at the expense of losing the optimisations.
Do you read this group?
--
rjc@cstr.ed.ac.uk
martelli@cadlab.sublink.ORG (Alex Martelli) (04/24/91)
rkowen@violet.berkeley.edu (Dario Bressanini) writes:
...
:When i REALLY HAVE to optimize a program, first of all i use
:a profiler to see where is the bottleneck, and THEN i try to optimize it;
:probably I am biased since i mostly write programs (90% in FORTRAN)
:for scientific computation (yes, I use floating points :-) ) where usually
:you have a single routine, or even a single loop, that takes 95% of
:the CPU time.
Ciao Dario! In interactive tasks such as solid modeling and CAD, you
would still find (in my experience) 'single-bottlenecks' responsible for
heavy slowdown of many tasks (no, not 95%!, but 30-40% are not uncommon),
except that different tasks, and different users, would hit different
bottlenecks - as a commercial program evolves over many releases, the
various 'specific bottlenecks' are found and removed.
:In most cases the best way to gain speed was to change completely
:the algorithm, and not to make some very low level optimization.
I can't argue with that - particularly as the bottlenecks are removed
over the life of an evolving program, there are times when you must
fully redo the design of a major subsystem, including all data structures
and often even the interfaces to other subsystems, for REAL performance
improvement!
:Following the latest "this is faster than that" wars I had
:the impression that they were pure void theoric discussions, without any
:connection with the "real world", at least to my world.
Since when do you mind 'pure void theoric discussions', DB...?-) {Personal
joke warning, me and DB are old fidonet friend/enemies!-}
:Just in case.... I don't want to start the usual and useless war
:C vs FORTRAN etc..,i would like to use C for my programs, but in most
:cases i tried, the Code produced by the C compiler was 2 or 3 times
:slower that the fortran code.
On what platforms? On workstation-class hardware, my experience is
just the reverse: despite all the theoretical advantages of Fortran,
which SHOULD be vastly easier to optimize for numerical computation
(no aliasing allowed...), I find most Fortran-generated code quite a
bit slower than C. I posted a benchmark once where on an incredibly pure
numerical computation, a 2D FFT over a 256x256 complex matrix, you
could even go faster by converting Fortran to C with f2c, then compiling
the C results, than by directly compiling the Fortran source!!! That
was on Sparc workstations with C and Fortran compilers of 18 months
ago, and I hope *this* particular aberration has since been removed,
but others remain and indeed loom larger and larger (bulk unformatted
I/O for checkpointing, for example, surely an important task for any
long-running program where the operating system is not so kind as to
do checkpointing for you transparently). On PC's, it's even worse, as
if compiler vendors barely cared for Fortran performance (particularly
in bulk I/O) but were locked in a mad race for C performance, or
something like that...
I do recall from my mainframe/supercomputing days that it USED to be
different - blazing FAST fortran-generated code, versus barely passable
C-generated code - although I DON'T recall any '2 or 3 times' difference,
possibly because I worked on machine where double precision math was
just as fast as single precision (I believe this is something of a
'trademark' of IBM machines, from 3090's down to PCs with FP coprocs,
and also including RS6000's WSs) - modern C's should allow you to use
single precision anyway where/when it matters (ANSI allows that, Sun C
while non-ANSI offers a special option for this, etc). Anyway I'm not
surprised if your results are on supers/mainframes, since apparently
those guys are GOOD at writing Fortran compilers (maybe it is that they
CARE about it!), while their WSs colleagues appear to be either less
able or less motivated...
If your results are instead obtained on WSs/PCs, we should exchange more
detailed notes - either you don't optimize your C well, or I don't optimize
my Fortran well... and this DOES matter to me, since, differing from you,
I'd like to keep using Fortran for those numerically-heavy parts of our
programs where Fortran does perfectly fine, thank you, without having to
pay a performance PRICE for that, however!!!
--
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 53, Bologna, Italia
Email: (work:) martelli@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434;
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).
rob@kaa.eng.ohio-state.edu (Rob Carriere) (04/25/91)
In article <22246@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) spouts forth:
>P.S.  As I keep pointing out, the _loader_ (or some load-time tool)
>_can_ satisfy the standard and still do intermodule optimization.
>The 'translator' (usually thought of as the compiler) cannot.

Err, yes it can.  Someone else has already explained how to do this by
including optimized and non-optimized versions of the code in the same
object file.  Since I assume that you read less selectively than you
answer, I won't repeat his scheme.

So, where does this leave us?  We now have 2 conforming
implementations of intermodular analysis.  Mister Giles started this
whole brouhaha with a claim that C couldn't be optimized as well as
FORTRAN, since you needed to do IM analysis for that and the standard
didn't allow such things.  I would think that 2 implementations count
as sufficient counterexamples, so would mister Giles either admit he
was wrong or else shut up?

Nice try to divert the argument, though.

SR
---
rjc@cstr.ed.ac.uk (Richard Caley) (04/25/91)
In article <22355@lanl.gov>, Jim Giles (jg) writes:

jg> In article <RJC.91Apr20023816@brodie.cstr.ed.ac.uk>, rjc@cstr.ed.ac.uk (Richard Caley) writes:

jg> [...] the only third way is if 'translation' does not
jg> include code generation (which will subsequently be done by some
jg> post-translation tool: like the loader - just like I keep saying).

rjc> When compiling `together' the compiler creates object files
rjc> containing two complete sets of code, data etc.  The linker looks at
rjc> the object files given and if they were not all compiled together (eg
rjc> if you had recompiled one) it uses the non-optimised code.

jg> Same as my third option.

No, you said the `linker' would have to do code analysis and code
generation; I said it would have to make a choice.  If the difference
is not clear, I suggest you try and code them.  How big do you think
interprocedural analysis plus code generation is?  I think it is clear
that choosing one of two sets of definitions takes about 100 lines,
including totally paranoid levels of defensive tests.

jg> The load-time tool actually does the interprocedural analysis.

Nope, it doesn't even need to know that the difference between the
alternate definitions _is_ interprocedural analysis.

jg> Your solution merely makes the loader's
jg> 'code generation' duties simpler at the expense of more expensive
jg> compilation and larger object files.  And, again, no implementation
jg> does this (yet).

Larger object files I'll grant you; I never said it was a commercially
viable solution, just the simplest to explain in a news message.

Ps.  Re: `common practices'.  The only C code I have ever seen written
one procedure to a file was the result of letting a PL/I and PL6
person loose with CC.

	name deleted to protect the not so innocent :-)
--
rjc@cstr.ed.ac.uk		_O_
				 |<
pa@curly.appmag.com (Pierre Asselin) (04/26/91)
It's late.  OK, I'm-a-byt'n...

In article <1991Apr24.174057.22470@ee.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob Carriere) writes:
>In article <22246@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) spouts forth:
>>P.S.  As I keep pointing out, the _loader_ (or some load-time tool)
>>_can_ satisfy the standard and still do intermodule optimization.
>>The 'translator' (usually thought of as the compiler) cannot.
>
>Err, yes it can.  Someone else has already explained how to do this by
>including optimized and non-optimized versions of the code in the same object
>file.  [...] I would think that 2 implementations count as sufficient
>counterexamples [...]

But does this scheme really count?  Suppose there are N modules
subject to interoptimization.  Translating any one of them leads to a
2^(N-1)-way branch as to what set of optimizations is allowed.
Hmmm... or 2^M, where M is the number of optimization tricks the
compiler knows about.  Still too big, though.

On that thought, good night.

  --Pierre Asselin, R&D, Applied Magnetics.  I speak for me.
wilber@alice.att.com (rew) (04/26/91)
J. Giles writes:
>> [...] but why is it so horrible to you to document
>> to the users of your optimized libraries that certain
>> arguments to certain functions must be carefully
>> non-aliased? [...]
>
>It's not horrible.  You miss my point if you believe that I would
>oppose that.  You give me a portable way to declare to the compiler
>that two (or more) arguments are assumed not to be aliased and I will
>rejoice.
>
>The problem is that what people are recommending is _not_ portable
>and is _not_ an explicit declaration (so the end user has to guess
>which variables the compiler optimized in this way and which are
>safe to alias).

I like C.  I don't like Fortran.  I find the likelihood that people
will still be coding in Fortran in the year 2020 to be depressing.
Having said that, I must say that in my experience the criticism that
J. Giles makes of C, that problems with potential aliasing make it
*very* difficult for optimizing compilers to do their job, is a
legitimate one.

I am working on a large program that does schedule optimization.
Speed is critical.  Parts of it are written in C, and most is written
in C++, which is translated into C by cfront.  (BTW, cfront
exacerbates the problems with aliasing by introducing large numbers of
spurious pointers.)  What I find is that Fortran code regularly runs
circles around carefully coded C and C++.  The problems intensify on
RISC machines, which are becoming all the rage, because of their
dependence on keeping data in registers for speed.  On an IBM RS6000 a
little test program I wrote in C runs about 6 times slower when I use
pointers to scalars than when I use local scalars directly.  One might
expect that a factor of 2 could be due to the need to dereference the
pointers; the rest is because most of the optimizations have to be
turned off as soon as pointers are used.
Unfortunately, once you get past the realm of toy test programs,
storing all your data in local variables is a non-option -- most of
the data is lurking in giant arrays and malloced structures, all
accessed through pointers.

I think that the pro-C camp has been too willing to sweep under the
rug the difficulties of making a C compiler deal with aliasing.  Sure,
global optimization across all 200 .c modules would solve many of the
problems, but how many C compilers actually do that?  Can you name 3
machines I can buy right now that have such a beast?  And when you try
to do that sort of thing, you quickly bump up against the limits of
your machine.  (Can you say, "Virtual memory exhausted"?  Can you say,
"Two day compilation"?)  Cross procedural optimization within a module
doesn't buy you as much as you might hope; simply knowing that none of
the calls to foo() within foo.c alias the parameters doesn't help if
foo() is declared external (the default), because some call to foo()
in another module might alias the parameters.  And I'd say that less
than 10% of all functions in typical C code are declared static.
(Even when functions aren't called outside of their modules,
programmers rarely bother to declare them static.)

So what *can* be done?  As has already been pointed out, it is not
unreasonable for a compiler to assume that non-char, non-void pointers
of different types are not aliased.  An option like
-this_programmer_should_be_fired should of course be available to turn
this assumption off, for things like compiling code taken from the
net.  I would go further and allow an option called, say,
-strict_typedefs, that would tell the compiler that in a case like
this:

    typedef int Apple;
    typedef int Orange;

an int can be an Apple or an Orange (or neither), but an Apple can't
be an Orange and an Orange can't be an Apple.  Thus in

    typedef long Cents;
    typedef long Quantity;

    Cents calculate_costs(Cents c[], Quantity q[]);

the compiler knows that c and q aren't aliased.
This would of course have to be an option because there is nothing in the definition of the language that requires that typedefs should behave this way. But what can be done about cases like this? foo(char* p, char* q, char* s); One can of course come up with various schemes involving macros or comments that inform your favorite compiler that the parameters aren't aliased, but the problem is that there are no standards for how this is to be done, so any such information is likely to be understood *only* by your favorite compiler, and no other. Right now, in the Real World (TM), optimized Fortran really does run faster than optimized C. A situation that truly sucks. Bob Wilber wilber@homxb.att.com
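[A small demonstration of why the -strict_typedefs idea above would have to be an extension: typedefs are only synonyms in C, so the aliasing the poster wants to rule out is perfectly legal today. The function name and body are invented for illustration.]

```c
/* Cents and Quantity are both 'long' to the type system, so c and q
 * may legally point to the same storage. */
typedef long Cents;
typedef long Quantity;

long settle(Cents c[], Quantity q[])
{
    c[0] = 100;
    q[0] = 200;
    return c[0];        /* 200 if q aliases c, 100 if not */
}
```

A compiler honoring -strict_typedefs could fold the return into the constant 100; standard C forbids that, because a call like settle(buf, buf) must return 200.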
jlg@cochiti.lanl.gov (Jim Giles) (04/26/91)
In article <1991Apr24.174057.22470@ee.eng.ohio-state.edu>, rob@kaa.eng.ohio-state.edu (Rob Carriere) rants: |> |> Err, yes it can. Someone else has already explained how to do this by |> including optimized and non-optimized versions of the code in the same object |> file. Since I assume that you read less selectively than you answer, I won't |> repeat his scheme. Yes, and I've already pointed out that this still relies on the _loader_ (or some post-translation tool) to make the actual decisions about what optimizations to use. Since I assume that you read less selectively than you answer, I won't repeat the discussion. |> [...] Mister Giles started this whole brouhaha with a claim |> that C coudn't be optimized as well as FORTRAN, since you needed to do IM |> analysis for that and the standard didn't allow such things. No, I didn't. I started this discussion by saying that C _isn't_ optimized as well as Fortran since you need IM analysis for that and the standard doesn't allow such things in the _translator_ and _no_ C implementation I've ever seen has the _loader_ (or any post- translation tool) do the job. Only _one_ C implementation has been brought to my attention which does IM analysis _at_all_. This is the MIPS C compiler, which does IM analysis in a non-standard way by requiring dependent compilation. Since I don't have access to the MIPS compiler, this info is of only academic interest. So - optimization of pointer arguments to procedures requires IM analysis in C. The translator can't do this and be standard conforming. If the translator leaves sufficient information in the object files, the loader (or some load-time tool) can make the optimizations (even if this only results in selecting among several versions that the compiler produced). In NO present implementation that does IM analysis is the method standard conforming. 
Finally, as I have pointed out before, I consider the MIPS method perfectly acceptable for all but commercial function libraries (which are inherently separately compiled so the vendor doesn't have to distribute source). There is nothing wrong with an implementation which provides an interesting capability as an _extension_ to the standard. What I object to is the claim that the extension _is_ standard. J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/27/91)
In article <RJC.91Apr25134502@brodie.cstr.ed.ac.uk>, rjc@cstr.ed.ac.uk (Richard Caley) writes:
|> [...]
|> No, you said the `linker' would have to do code analysis and code
|> generation, I said it would have to make a choice.  If the difference
|> is not clear, I suggest you try and code them.  How big do you think
|> interprocedural analysis plus code generation is?  I think it is clear
|> that choosing one of two sets of definitions takes about 100 lines,
|> including totally paranoid levels of defensive tests.

I'll show you what needs to be done, you show me how it can be
done without interprocedural analysis.

File1:                File2:
   int a, b;             func1(x)
   main (){              int *x;
   ...                   {
      func1(&a)             func2(x); func3(x);
   ...                      ...
   }                     }

File3:                File4:
   extern int a;         extern int b;
   func3(y)              func3(z)
   int *y;               int *z;
   {                     {
      expression(a,y);      expression(b,z);
   }                     }

These four files are separately compiled.  If loaded together, they will
produce a program in which y and a are aliased in func2(); also b and z
will _not_ be aliased in func3() (because z is really a pointer to a).
Now, the optimization I want is for the final code to optimize func3()
as much as possible since there is no aliasing.  But, the solution must
also inhibit unsafe optimizations in func2() because there _is_ aliasing.

Now, you show me your specific solution and then we'll be able to decide
whether it works, whether it is standard conforming, whether it
generalizes to more than one arg/global, and whether it does
interprocedural analysis or not.

|> [...]
|> jg> The load-time tool actually does the interprocedural analysis.
|>
|> Nope, it doesn't even need to know that the difference between the
|> alternate definitions _is_ interprocedural analysis.

If the method you are proposing manages the above example correctly,
then the choice made by the load-time tool _is_ intermodular analysis.
Whether it "knows" it or not is an irrelevant point.  I don't think
that any program yet written can be said to _know_ anything.

J. Giles
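[A single-file, runnable condensation of the four-file example — reading the first func3 as func2, per the author's later correction, and with trivial invented bodies — showing the two situations the optimizer has to tell apart.]

```c
/* Condensed version of the example (bodies invented).  When main()
 * passes &a down, func2()'s parameter aliases the global 'a', while
 * func3()'s parameter does NOT alias the global 'b'. */
int a, b;

int func2(int *y)       /* y may alias a -- and here it does */
{
    a = 1;
    *y = 2;
    return a;           /* 2: the store through y changed a */
}

int func3(int *z)       /* z never aliases b in this program */
{
    b = 1;
    *z = 2;
    return b;           /* an optimizer may fold this to the constant 1 */
}

int func1(int *x)
{
    return func2(x) + func3(x);
}
```

func1(&a) returns 3: folding the load of b in func3() is safe, but the same folding in func2() would wrongly return 1 instead of 2 — and when the files are compiled separately, proving which case is which takes exactly the intermodular analysis under discussion.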
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/27/91)
In article <RJC.91Apr25134502@brodie.cstr.ed.ac.uk> rjc@cstr.ed.ac.uk (Richard Caley) writes:
[ quite reasonable suggestions ]
Gee, sounds like exactly the same argument as we had in comp.lang.misc.
Jim said that because C *allows* aliasing, it *forces* optimizers to
assume aliasing at every step. I said that the compiler could put
together any number of versions of each subroutine, one for each set of
anti-aliasing constraints chosen by any fixed algorithm.
It could, for instance, compile one version where no arguments are
aliased, and one where all arguments are aliased. In the case Jim cares
about (i.e., Fortran-ish array code), nothing would be aliased, so the
optimized versions would be used in every case. So the C code would be
optimized *exactly* as well as a Fortran version, with *no*
interprocedural analysis on the linker's part.
How quickly Jim forgets.
Well, no, I should be fair: he never admitted to losing the argument
that time, so he probably doesn't realize that he's lost it this time.
Poor guy. Still, Caley and I came up with the solution independently,
and probably any compiler writer who doesn't have such a closed mind as
Jim can come up with it too, so I don't think Jim's prejudice will hurt
the future of C optimization.
---Dan
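[A runtime sketch of the several-versions idea above, with all names invented. In the actual proposal the choice between versions is made at link time by a fixed algorithm; here a pointer overlap test stands in for the linker's decision — and note that comparing pointers into different objects is itself not strictly portable, so this is a sketch, not a conforming implementation.]

```c
/* Two translations of the same source loop.  scale_opt() hoists the
 * load of *src -- legal only when src does not point into dst[];
 * scale_gen() is the conservative version that reloads every time. */
void scale_opt(int *dst, const int *src, int n)
{
    int i, s = *src;                 /* hoisted load */
    for (i = 0; i < n; i++)
        dst[i] = 2 * s;
}

void scale_gen(int *dst, const int *src, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = 2 * *src;           /* reload in case dst aliases src */
}

/* Stand-in for the loader's choice: pick the fast version whenever
 * the no-alias constraint holds. */
void scale(int *dst, const int *src, int n)
{
    if (src < dst || src >= dst + n)
        scale_opt(dst, src, n);
    else
        scale_gen(dst, src, n);
}
```

In the Fortran-ish array code under discussion, nothing aliases, so the optimized body is selected every time; the conservative body exists only to keep the aliased case (where the two loops genuinely produce different results) correct.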
bhoughto@pima.intel.com (Blair P. Houghton) (04/27/91)
In article <22392@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >bhoughto@hopi.intel.com (Blair P. Houghton) writes: >> for three days when we say it is most definitely NOT >> dependent on the source language? > >Whoops! Forgot to mention: NCEG is also affiliated with WG14 (the >ISO C working group). So you better set those guys straight too. >(It couldn't be that these people know something that Blair Houghton >doesn't? Nah - no chance.) The thing they know that I don't is a good reason why it should be necessary to force certain data into a non-aliased paradigm rather than trusting the programmer to keep certain data from being aliased. The best bad reason is that standards are better at enforcement than documentors are. It's neither my fault, my concern, nor my problem that they're chasing pots of dingy gold at the end of haggard rainbows, and I certainly don't want to be taxed for it later by having to pay an extra $20 per copy of the Standard for 100 pages of errata explaining the contortions one must go through to define a non-aliasing implementation. They can elide the trigraphs sections while they're at it, too. Their best bet is to do what we've all done: wait for a few examples of prior art, and then choose from the good ones. It's a perpetual fool's chase, trying to design by committee what should be designed by evolution. X3.159-1989 shows in spots the effects of this political methodology to science, and I'm thrilled to say that the noalias crock isn't one of the mistakes they made. --Blair "If you want f77(1), you know where to find it."
bhoughto@pima.intel.com (Blair P. Houghton) (04/27/91)
In article <20270@alice.att.com> wilber@homxb.att.com writes: >One can of course come up with various schemes involving macros or comments Or #pragma's... >that inform your favorite compiler that the parameters aren't aliased, but the >problem is that there are no standards for how this is to be done, so any >such information is likely to be understood *only* by your favorite compiler, >and no other. It's likely to be so, anyway. If I have invented an optimization that makes 6x gains in speed of operations on pointer-accessed data, how fast do you expect me to propagate that to my marketplace competitors? Patent Office or no, it's going to take a long time and cost someone a lot of money to get this little feature into any more compilers. >Right now, in the Real World (TM), optimized Fortran really does run faster >than optimized C. A situation that truly sucks. Funny, that happened in '64, too. --Blair "...everyone's figuring out how old we were..."
torek@elf.ee.lbl.gov (Chris Torek) (04/27/91)
(I have tried to stay out of this, as arguing with Jim Giles is bad for my disposition---everything he writes seems to me a vicious attack, as if his postings were little rabid dogs, poised to bite the unwary; but that is only my opinion. At any rate, regardless of the merits of some of his points [there are some merits!] I wish to make two comments.) In article <22687@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes: >Only _one_ C implementation has been brought to my attention which >does [inter-module] analysis _at_all_. This is the MIPS C compiler ... The MIPS compiler is definitely not the only such, though it may be the only commercial one. For instance, the Plan 9 compilers do much of their optimization in the linker; there may be a special flag to disable this in case the linker's optimizer is broken, but the default is inter-module optimization. (Indeed, the Plan 9 group have decided they may have gone too far in that direction, as it takes over 30 seconds to compile the Plan 9 kernel from scratch, and if more of the optimization were done in the `compiler' part more of it could be done in parallel. Would that gcc could compile BSD in under a minute, or even under 10 ... ah well.) >Finally, as I have pointed out before, I consider the MIPS method >perfectly acceptable for all but commercial function libraries (which >are inherently separately compiled so the vendor doesn't have to >distribute source). ... Jim seems to be under the impression that in order to get the MIPS Ucode optimizer to function, you must give it source. This is not the case. One of the compiler options causes it to emit `.u' files (so named simply to avoid confusion, no doubt; the only difference between a .u and a .o file is that the latter contains much more machine code and much less intermediate information). These .u files can be bound up in a library. They are not source code (there are no syntax errors in .u files) and they certainly could be distributed commercially. 
They do require more time to load than an equivalent library of MIPS
.o files, which some commercial vendors might view as an impossible
barrier, but that would be silly (vendors could distribute both formats,
one for `test compilation' and one for `final compilation', if
compilation time is considered important).
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA        Domain: torek@ee.lbl.gov
sarima@tdatirv.UUCP (Stanley Friesen) (04/28/91)
In article <790@cadlab.sublink.ORG> martelli@cadlab.sublink.ORG (Alex Martelli) writes:
>rkowen@violet.berkeley.edu (Dario Bressanini) writes:
...
>:When i REALLY HAVE to optimize a program, first of all i use
>:a profiler to see where is the bottleneck, and THEN i try to optimize it;
>: ...
>Ciao Dario!  In interactive tasks such as solid modeling and CAD, you
>would still find (in my experience) 'single-bottlenecks' responsible for
>heavy slowdown of many tasks (no, not 95%!, but 30-40% are not uncommon),
>except that different tasks, and different users, would hit different
>bottlenecks - as a commercial program evolves over many releases, the
>various 'specific bottlenecks' are found and removed.

Quite, I too optimize after profiling.  And most of the time there seems
to be one specific bottleneck.

One of the most interesting I have ever found was one involving union
assignment.  The program had a union somewhat like this:

union any_type {
	int i;
	long l;
	double d;
	char *str;
	struct stuff s;
};

At several places in the code the original writer had used union/struct
assignment to copy the value of this union.  The compiler dutifully
inserted a tight copy loop to copy the entire union for each assignment.
As a result the program spent nearly 1/3 of its time copying the union!!

I simply replaced each assignment with a switch on the actual type and
assigned only the active member of the union - voila, a *huge* improvement.

So, watch out for this one!

(It took me several days to figure out why simple linear code with no
(explicit) loops was using up all of the time!!  I had to disassemble
the binary and locate the in-line copy loop.)
--
---------------
uunet!tdatirv!sarima		(Stanley Friesen)
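[A sketch of the fix described in the post above, with the tag enum and names invented: copy only the active member of a tagged union instead of struct-assigning the whole thing, which blindly copies sizeof(union) bytes every time.]

```c
/* The large member forces every whole-union assignment to copy the
 * full union, even when only an int is live. */
struct stuff { double pad[16]; };

union any_type {
    int i;
    long l;
    double d;
    char *str;
    struct stuff s;
};

enum any_tag { T_INT, T_LONG, T_DOUBLE, T_STR, T_STUFF };

void copy_any(union any_type *dst, const union any_type *src,
              enum any_tag tag)
{
    switch (tag) {
    case T_INT:    dst->i = src->i;     break;  /* a few bytes...      */
    case T_LONG:   dst->l = src->l;     break;
    case T_DOUBLE: dst->d = src->d;     break;
    case T_STR:    dst->str = src->str; break;
    case T_STUFF:  dst->s = src->s;     break;  /* ...vs the full 128  */
    }
}
```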
burow@cernvax.cern.ch (burkhard burow) (04/28/91)
Everybody's favourite topic .....

jlg@cochiti.lanl.gov (Jim Giles) gives us the example
>I'll show you what needs to be done, you show me how it can be
>done without interprocedural analysis.
>
>File1:                File2:
>   int a, b;             func1(x)
>   main (){              int *x;
>   ...                   {
>      func1(&a)             func2(x); func3(x);
>   ...                      ...
>   }                     }
>
>File3:                File4:
>   extern int a;         extern int b;
>   func3(y)              func3(z)
>   int *y;               int *z;
>   {                     {
>      expression(a,y);      expression(b,z);
>   }                     }
>

Now let's rewrite this in everybody's favourite language

File1:                File2:
   common/puke/ia,ib
   program demo          func1(ix)
   ....                  ....
   call func1(ia)        call func2(ix)
   ...                   call func3(ix)
   end                   ...
                         end

File3:                File4:
   common/puke/ia,ib     common/puke/ia,ib
   func2(iy)             func3(iz)
   expression with ia,iy expression with ib,iz
   end                   end

Don't laugh, this little bit of FORTRAN almost killed me.  And as with
the original posting, we're all going to ignore little syntax errors
that may have crept in.

Now jlg@cochiti.lanl.gov (Jim Giles) gave this challenge:
>These four files are separately compiled.  If loaded together, they will
>produce a program in which y and a are aliased in func2(); also b and z
>will _not_ be aliased in func3() (because z is really a pointer to a).
>Now, the optimization I want is for the final code to optimize func3()
>as much as possible since there is no aliasing.  But, the solution must
>also inhibit unsafe optimizations in func2() because there _is_ aliasing.

Now I just don't get it.  If the C compiler/linker/what_ever_you_call_it
can't optimize the nonalias in func3, and not screw up with the alias in
func2, what's so magical about FORTRAN?  It would seem to be in the same
boat.  I fear that I'm on the wrong side of ignorance, so I would
appreciate an explanation.

A bit o' irrelevant background: I must admit that I care much more about
the programmer's performance (mainly me!) than I do about the machine's
performance.  I'm a C hack swimming in a FORTRAN world.
To keep me from drowning I've created CFORTRAN, a set of macros that
'prototype' FORTRAN functions.  e.g. When my friends do:
      CALL HBOOK1(1,'pT spectrum of pi+',100,0.,5.,0.)
I can do:
      HBOOK1(1,"pT spectrum of pi+",100,0.,5.,0.);
after I've 'prototyped' the FORTRAN function:
PROTOCCALLSFSUB6(hbook1,INT,STRING,INT,FLOAT,FLOAT,FLOAT)
#define HBOOK1(ID,CHTITLE,NX,XMI,XMA,VMX) \
 CCALLSFSUB6(hbook1,INT,STRING,INT,FLOAT,FLOAT,FLOAT, \
               ID,CHTITLE,NX,XMI,XMA,VMX)

I create the prototypes (about 20 FORTRAN routines per hour, main
bottleneck is my typing speed) using info. from the FORTRAN routines'
documentation.  i.e. My C code uses 'FORTRAN' object code, without me
ever looking at, or dealing with, a single line of FORTRAN source code.

I've got CFORTRAN running, i.e. I've created a cfortran.h file with the
preprocessor directives, on RISC Mips (e.g. Silicon Graphics, DECstations)
and VAX VMS machines.  I'm still trying to sell the idea to the world,
but have already generated some bootstrap interest.  If you're curious,
send mail.

QQQQuestion: The clarification I ask for above is actually a serious one,
for I'd like to know if with my pointer freedom in C, I can pass arguments
to a FORTRAN routine which will blow up, because of its optimizations
based on restrictions in FORTRAN.

QQQQuestion: Lots of manuals have little sections on calling and linking
different languages within the same program.  How many people out there
actually do this sort of thing?  e.g. How many people out there use C,
handling the messy jobs of i/o, user interface, etc., as a frontend to
number crunching FORTRAN routines?  It doesn't matter whether you call
the FORTRAN routines because of performance reasons, or as in my case,
because they exist and work.

hoping somebody reads this line.
burkhard		burow%13313.hepnet@csa3.lbl.gov
rob@mckinley.eng.ohio-state.edu (Rob Carriere) (04/29/91)
In article <703@curly.appmag.com> pa@appmag.com (Pierre Asselin) writes:
[about the 2-objects-in-1 scheme of IM optimization]
>But does this scheme really count?  Suppose there are N modules subject
>to interoptimization.  Translating any one of them leads to a
>2^(N-1)-way branch as to what set of optimizations is allowed.
>Hmmm... or 2^M, where M is the number of optimization tricks the
>compiler knows about.  Still too big, though.

Well, this is why us academic types invented the thought experiment... :-)

Seriously, the proposed implementation was just that: a proof of concept.
A real implementation would be a lot smarter about it.  If you build an
RCS-like structure of diffs, I'm pretty sure you can get away with low
space and time wastage.  Or you could be lazy and either IM optimize or
not at all.  The idea was, after all, to encourage the programmer to use
a compilation scheme that provides IM optimization, but still allow
compilation in the general case for standard conformance.  In other words,
the executable that results when the linker goes `oops' can stink as long
as it works well enough to conform to the standard.

SR
---
rob@mckinley.eng.ohio-state.edu (Rob Carriere) (04/29/91)
In article <20270@alice.att.com> wilber@homxb.att.com writes: >Cross procedural optimization within a module doesn't buy you as much as you >might hope; simply knowing that none of the calls to foo() within foo.c alias >the parameters doesn't help if foo() is declared external (the default) because >some call to foo() in another module might alias the parameters. And I'd say >that less than 10% of all functions in typical C code are declared static. >(Even when functions aren't called outside of their modules, programmers rarely >bother to declare them static.) Well, the idea behind C has always been "the programmer knows what s/he is doing." Since that programmer _should_ have declared those fns static, I don't see where I should be bothered by the fact that not doing so may inhibit optimization. >Right now, in the Real World (TM), optimized Fortran really does run faster >than optimized C. A situation that truly sucks. It seems to me that, like mister Giles, you are confusing properties of the language with properties of the implementation. It would sure be nice to have those wonderful compilers today, but their presence or absence has nothing to do with the merits or lack thereof of the language C. SR ---
rob@mckinley.eng.ohio-state.edu (Rob Carriere) (04/29/91)
In article <22687@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
!Yes, and I've already pointed out that this still relies on the
!_loader_ (or some post-translation tool) to make the actual decisions
!about what optimizations to use. Since I assume that you read less
!selectively than you answer, I won't repeat the discussion.
Good. It would have been totally irrelevant. I do, however, want to
apologize about the selective reading gibe. Your answer to the post I was
referring to got here out of sequence (to use an understatement).  I should
have realized that USENET will occasionally do such wonderful things.
!No, I didn't. I started this discussion by saying that C _isn't_
!optimized as well as Fortran since you need IM analysis for that
!and the standard doesn't allow such things in the _translator_ and
!_no_ C implementation I've ever seen has the _loader_ (or any post-
!translation tool) do the job. Only _one_ C implementation has been
!brought to my attention which does IM analysis _at_all_. This is
!the MIPS C compiler, which does IM analysis in a non-standard way
!by requiring dependent compilation. Since I don't have access to
!the MIPS compiler, this info is of only academic interest.
Ah. My mistake. I thought you were talking about C. What, pray tell, is a
discussion about compilers doing in comp.lang.c?
SR
---
rob@mckinley.eng.ohio-state.edu (Rob Carriere) (04/29/91)
In article <5079@cernvax.cern.ch> burow@cernvax.cern.ch (burkhard burow) writes:
>jlg@cochiti.lanl.gov (Jim Giles) gives us the example [IM alias in C]
>Now lets rewrite this in everybody's favourite language
[same example in FORTRAN]
>Now I just don't get it. If the C compiler/linker/what_ever_you_call_it can't
>optimize the nonalias in func3, and not screw up with the alias in func2,
>what's so magical about FORTRAN? It would seem to be in the same boat.

Nope, it isn't.  FORTRAN simply states that such programs are illegal.
("Undefined behavior" is the C standard speak; I'm no longer sure of the
FORTRANesque.)  So basically, in FORTRAN you are responsible for staying
out of the compiler's way, and in C it's the other way around.  This makes
for safer programming on the C side of the fence (never thought you'd hear
_that_ about C, right :-), but it also makes the C compiler's[1] job
_much_ harder.

>hoping somebody reads this line.

I did. :-)

SR
[1] compiler is used here to mean translator+linker+loader.  It was
convenient, and I feel like Humpty Dumpty today. :-)
---
shap@shasta.Stanford.EDU (shap) (04/29/91)
Smart compilers do interprocedural optimization between files by understanding how to take multiple file names on the command line. This way no magic in the linker is required, though I grant it helps.
mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) (04/29/91)
jlg@cochiti.lanl.gov (Jim Giles) writes:

>the MIPS C compiler, which does IM analysis in a non-standard way
>by requiring dependent compilation.

No it doesn't.  Even with full MIPS IM optimisation, compilation order is
independent (it creates .u files; if you apply their .u-to-symbolic
translator to them, you'll find that they look suspiciously like RTL).

Mike.
--
Mike McGaughey			AARNET:	mmcg@bruce.cs.monash.oz.au

"His state is kingly; thousands at his bidding speed,
 And post o'er land and ocean without rest."	- Milton.
jlg@cochiti.lanl.gov (Jim Giles) (04/29/91)
In article <5079@cernvax.cern.ch>, burow@cernvax.cern.ch (burkhard burow) writes:
|> Everybody's favourite topic .....
|>
|> jlg@cochiti.lanl.gov (Jim Giles) gives us the example
|> [...]
|> >File3:                File4:
|> >   extern int a;         extern int b;
|> >   func3(y)              func3(z)
        ^
Did I actually say func3 in both places?  Sorry - the one in file3
should be func2.

|> Now lets rewrite this in everybody's favourite language
|>
|> File1:                File2:
|>    common/puke/ia,ib
|> [...]
|> File3:                File4:
|>    common/puke/ia,ib     common/puke/ia,ib

To be the same as the C example, there would have to be two common
blocks: one with just ia and the other with just ib.  The ia one is
common with file3, the ib one with file4.

|> [...]
|> Now I just don't get it. If the C compiler/linker/what_ever_you_call_it can't
|> optimize the nonalias in func3, and not screw up with the alias in func2,
|> what's so magical about FORTRAN? It would seem to be in the same boat.

That is exactly the point I made earlier about empowerment.  C allows
both subroutines, but most implementations pessimize both func2 and
func3 because they can't tell whether their argument is aliased to
their global or not.  Fortran also restricts the user, but not in
speed.  The Fortran compiler optimizes both func2 and func3 as if
no aliasing has occurred - but then, the call in this example which
causes aliasing in func2 is _ILLEGAL_.  Since deliberate aliasing
of this kind is rare (or can be made rare by coding around it), the
Fortran rule permits the user to get better code generated.

_NEITHER_ language empowers the user to _explicitly_ ask for aliasing
or not as he pleases.  C _could_ be optimized anyway, by having a tool
that traces the identity of procedure arguments through the call tree.
The best place to do this would be the loader, which has to do call
tree analyses anyway.
Of course, the same trick would allow Fortran to be extended to allow
the call to func2 after all - the loader could in this case _inhibit_
the optimizations that Fortran usually performs.

Without either an explicit declaration or a call chain analysis, there
is no general way I know of to optimize where it's safe to do so, and
not optimize when aliasing occurs.  Last fall, someone on comp.lang.misc
posted a method that relied on information about functions given in
their .h files (this method would only work for functions which _have_
.h files associated with them).  However this method *as stated* would
only find aliasing one level deep in the call chain.  The example I
gave here would have fooled it.

|> [...]
|> QQQQuestion: The clarification I ask for above is actually a serious one, for
|> I'd like to know if with my pointer freedom in C, I can pass arguments to a
|> FORTRAN routine which will blow up, because of its optimizations based on
|> restrictions in FORTRAN.

Calling Fortran from C is safe as long as you remember not to pass
pointers to the same object (or within the same object) as two different
arguments to the Fortran routine, and if you don't pass pointers to
global objects that the Fortran routine can see.  Actually, if the
arguments that you pass are not modified by the Fortran routine, it's
safe to alias them.  But, it's safer not to alias anything if you can
help it.

J. Giles
jlg@cochiti.lanl.gov (Jim Giles) (04/29/91)
In article <4037@bruce.cs.monash.OZ.AU>, mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) writes:
|> [...]
|> Even with full MIPS IM optimisation, compilation order is independent
|> (it creates .u files; if you apply their .u-to-symbolic translator to
|> them, you'll find that they look suspiciously like RTL).

Oh.  Sorry (this will surprise those who claim that I never admit to
being wrong).  As I said, I don't have access to MIPS.  The capability,
as it was described to me, made no mention of .u files and, in fact,
claimed that the compiler did the IM analysis.  If it is done by a
post-translation tool, then it _is_ standard conforming.

J. Giles
martelli@cadlab.sublink.ORG (Alex Martelli) (04/29/91)
burow@cernvax.cern.ch (burkhard burow) writes:
...
:QQQQuestion: The clarification I ask for above is actually a serious one, for
:I'd like to know if with my pointer freedom in C, I can pass arguments to a
:FORTRAN routine which will blow up, because of its optimizations based on
:restrictions in FORTRAN.
You can, but you can also blow up Fortran from itself quite easily, i.e. any
'CALL ROUT(X,X)' or 'COMMON/A/X ... CALL TINE(X)' *could* cause the blowup.
Such aliasing is FORBIDDEN in Fortran, but ENFORCING the prohibition is left
to mysterious runtime errors in most cases... just like, in C, free()ing a
pointer NOT obtained via malloc(), or, in any language, using an index out
of range to address an array (indeed, some Fortran compilers will optionally
do array-index-checking, and some malloc()/free() packages will check for
sanity, so I assume somewhere there must exist a compiler that will do some
checks and output error messages on aliasing, though I know of none).
:QQQQuestion: Lots of manuals have little sections on calling and linking
:different languages within the same program. How many people out there actually
:do this sort of thing? e.g. How many people out there use C, handling the messy
:jobs of i/o, user interface, etc., as a frontend to number crunching FORTRAN
:routines? It doesn't matter whether you call the FORTRAN routines because of
:performance reasons, or as in my case, because they exist and work.
I do it the other way 'round, i.e. use C as the backend (for graphics, OS
access, all sorts of algorithms-and-data-types sort of work) for Fortran
programs; it's always Fortran calling C, never the other way around. As the
fraction of C grows (mostly a reflection of the bad implementations of Fortran
on workstation-class machines, that we observe), SOME C->Ftn cases appear; and
I would indeed like some automated/C-macro approach to such interfacing, just
as I have built for C routines intended to be called from Fortran. The only
hard problem I see is passing CHARACTER arguments - it's a very convoluted
task in the Ftn->C direction (to keep it portable between 12+ platforms), I
just have not devoted enough work to the C->Ftn direction, so far. I have
seen your "cfortran" stuff, one month ago, but I have not delved deeply enough
into it to see how hard/easy it would be to generalize it to Apollo, OS/2, HP,
Sun, IBM RS/6000, etc etc...
--
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 53, Bologna, Italia
Email: (work:) martelli@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434;
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).
rkowen@violet.berkeley.edu (Dario Bressanini) (04/30/91)
martelli@cadlab.sublink.ORG (Alex Martelli) writes:

>>:Just in case.... I don't want to start the usual and useless war
>>:C vs FORTRAN etc..., I would like to use C for my programs, but in most
>>:cases I tried, the code produced by the C compiler was 2 or 3 times
>>:slower than the Fortran code.

>On what platforms?  On workstation-class hardware, my experience is
>just the reverse: despite all the theoretical advantages of Fortran,
>which SHOULD be vastly easier to optimize for numerical computation
>(no aliasing allowed...), I find most Fortran-generated code quite a
>bit slower than C.

Ciao Alex!  I didn't know I was starting such a discussion with my first
innocent message.  Well, I tried on mini systems (Gould, Convex ...).
When I used f77 (the standard unix compiler) the performances were *VERY*
bad, but when I switched to the vendor's compiler the Fortran code was
faster than the C code.  I plan to try again on the RS-6000 and on a
Personal IRIS to see if there is any difference.

>>On PC's, it's even worse, as
>>if compiler vendors barely cared for Fortran performance (particularly
>>in bulk I/O) but were locked in a mad race for C performance, or
>>something like that...

I agree with you.  Once I reduced the execution time of a Fortran program
on an IBM PC from 18 hours to 2 by changing a single line!  It was
something like

    a = (expr)**3

I changed it to

    b = expr
    a = b*b*b

et voila' (expr is a floating point expression).  If the compiler cannot
do this kind of optimization, well, it is a very poor compiler.
On my Atari ST, the FASTEST language is......compiled BASIC!
(and now all comp.lang.c will flame me :-) )

>>modern C's should allow you to use
>>single precision anyway where/when it matters (ANSI allows that, Sun C
>>while non-ANSI offers a special option for this, etc).
>>Anyway I'm not
>>surprised if your results are on supers/mainframes, since apparently
>>those guys are GOOD at writing Fortran compilers (maybe it is that they
>>CARE about it!), while their WSs colleagues appear to be either less
>>able or less motivated...

Yes, with ANSI C you can use single precision, but it is almost USELESS
in scientific computations (at least in my field, i.e. quantum chemistry).
If I use single precision I cannot trust the results!  FORTRAN and C
belong to different worlds: C is for programmers while FORTRAN is for
most of the scientific community.  Most of the time a C programmer
doesn't give a damn about floating point performance, while sometimes,
doing scientific computations, I even have to care about the performance
of DOUBLE PRECISION COMPLEX.  I would like to use a language which
combines the good features of the two, but I guess I'll have to wait...

>>Since when do you mind 'pure void theoric discussions', DB...?-) {Personal
>>joke warning, me and DB are old fidonet friend/enemies!-}

things change....

rob@mckinley.eng.ohio-state.edu (Rob Carriere) writes:

>>>Right now, in the Real World (TM), optimized Fortran really does run faster
>>>than optimized C.  A situation that truly sucks.

>>It seems to me that, like mister Giles, you are confusing properties of the
>>language with properties of the implementation.  It would sure be nice to have
>>those wonderful compilers today, but their presence or absence has nothing to
>>do with the merits or lack thereof of the language C.

I don't agree with this.  We can talk forever about the properties of a
language, but when you write a program you have to deal with the actual
implementation.  I am glad to know that a language has certain properties,
but if the implementation doesn't use them, well, they don't count for me.
If my Fortran program, which takes 100 hours on a CRAY, could take 80
hours if rewritten in C, I'll start coding immediately; but if you say
that I should judge C from the fact that in theory I could have a
program that takes 80 hours, and not from the fact that the actual C
implementation would take longer than 100 hours, well...

dario bressanini
rkowen@violet.berkeley.edu
throopw@sheol.UUCP (Wayne Throop) (04/30/91)
> jlg@cochiti.lanl.gov (Jim Giles)
> In NO present implementation that does IM analysis
> is the method standard conforming.

To quote Inigo Montoya: "You keep using that word.  I do not think it
means what you think it means."

In fact an implementation that uses inter-module analysis to get
superior optimization of cases that can be proven non-aliased may very
well be standard conforming.  Certainly the inter-module optimizations
of the MIPS C compiler aren't what (if anything) keep it from being
standard conforming.  In just what way are inter-module optimizations
supposed to extend or violate the standard?  As far as I can tell,
they do neither.

( Note that these inter-module optimizations can be performed within
  the usual capabilities of current Unix .o and .a file formats, at
  the cost of a tradeoff in multiple, differently-optimized
  translations output from the compile phase, in much the same way as
  inter-module type checking is done within this restriction by C++.
  Not that there aren't better ways available if you allow the
  introduction of a better intermediate form.  It's just that this
  conservative approach (while not totally satisfactory) would work
  and yield 99 and 44/100ths percent of the benefits Jim is looking
  for. )
--
Wayne Throop   ...!mcnc!dg-rtp!sheol!throopw
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/01/91)
In article <22839@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes:
> That is exactly the point I made earlier about empowerment.  C allows
> both subroutines, but most implementations pessimize both func2 and
> func3 because they can't tell whether their argument is aliased to
> their global or not.  Fortran also restricts the user, but not in
> speed.  The Fortran compiler optimizes both func2 and func3 as if
> no aliasing has occurred - but then, the call in this example which
> causes aliasing in func2 is _ILLEGAL_.  Since deliberate aliasing
> of this kind is rare (or can be made rare by coding around it), the
> Fortran rule permits the user to get better code generated.

Here I think we have come to the core of the problem.  Friends, let's
face it, Jim Giles' main point (that many currently available C
compilers miss chances to generate better code because they think
pointers _might_ be aliased) IS VALID.  There are some really good
compilers out there (Plan 9 sounds _wonderful_) but there are some
not-so-good compilers, and especially on the top end machines that I
suspect Jim Giles cares most about, Fortran compilers really do
generate much better code than C.

(There are other reasons for using Fortran on top end machines.  For
example, a certain mini-super manufacturer making what's basically an
improved DAP lets the number of processors in the actual machine
"show through" in C, but gives you however many "logical processors"
you want in Fortran.  Ok, fair enough for C to have a way of getting
at the hardware directly, but there is no advantage to customers in
forbidding "virtual processors" to C programmers.)

The key difference is
-- C permits aliasing.  The price of this is that in the absence of
   good interfile analysis, you get inferior optimisation.  This is
   a Quality of Implementation issue, because interfile analysis
   _can_ be done without violating the standard.
-- Fortran forbids aliasing.
The price of this is that in the
   absence of good interfile analysis, the fact that the code that
   has been generated is WRONG because there _is_ aliasing goes
   quietly un-noticed.

Let me show you a typical example of aliasing in C:

	struct ListNode {
	    struct ListNode *next;
	    int item;
	};

	void ListDelete(struct ListNode **p, int item)
	{
	    struct ListNode **v = p;
	    struct ListNode *q;

	    for (q = *p; q != NULL; q = q->next)
		if (q->item != item)
		    *v = q, v = &(q->next);
	    *v = q;
	}
	...
	ListDelete(&var, item);
	...

At various times we have *v and *p as names for the same variable,
**p and *q as names for the same variable, **v and *q as names for
the same variable, and so on.  This is aliasing.  C pointers would be
almost useless for managing linked structures unless "aliasing
allowed" were the default.  (This is not a defence of C pointers.
There may well be better ways of accomplishing the same ends.)

Fortran, however, _used_ not to have pointers.  I know that Fortran
Extended _has_ pointers, but not having a current draft, I don't know
what form they now take.  Can Fortran Extended pointers be used for
this sort of linked structure manipulation, and what does that do to
Fortran's prohibition of aliases?

It should also be borne in mind that the prohibition of aliases in
the Fortran standard used to be widely ignored by Fortran
programmers.  For example, consider matrix addition:

	subroutine matadd(a, b, c, m, n)
	integer m, n
	real a(m,n), b(m,n), c(m,n)
	integer i, j
	do 20 j = 1, n
	    do 10 i = 1, m
		a(i,j) = b(i,j) + c(i,j)
   10	    continue
   20	continue
	end

I've lost count of the number of books I've seen which tell you that
it is ok to call a subroutine like this as
	call matadd(a, a, c, m, n)
or	call matadd(a, b, a, m, n)
This is _precisely_ the kind of aliasing that Fortran prohibits.  The
effect of such a call is undefined.  A Fortran compiler is perfectly
entitled to generate code that starts out
	if (iaddr(a).eq.iaddr(b)) call system('rm *')
The corresponding C code is required to work.
If Jim Giles accompanied his criticism of C for allowing aliasing with a swipe at Fortran compiler vendors whose products do not by default warn users when they commit the "error" of aliasing, then I'd be impressed by his fairness. (For what it's worth, I have no objections to using Fortran, and would _love_ to have access to a Fortran Extended compiler.) -- Bad things happen periodically, and they're going to happen to somebody. Why not you? -- John Allen Paulos.
jlg@cochiti.lanl.gov (Jim Giles) (05/01/91)
In article <5475@goanna.cs.rmit.oz.au>, ok@goanna.cs.rmit.oz.au
(Richard A. O'Keefe) writes:
|> [...]  Can Fortran Extended pointers be used for this
|> sort of linked structure manipulation, [...]

Yes.

|> [...] and what does that do to Fortran's
|> prohibition of aliases?

Shoots it all to hell.  But, there is a very important difference.
Fortran does _not_ automatically convert array reference arguments
into pointers when you make a function call.  So, array calculations
can still be compiled to efficient code without intermodule analysis.

|> [...]
|>	subroutine matadd(a, b, c, m, n)
|> [...]
|>	    a(i,j) = b(i,j) + c(i,j)
|> [...]
|> I've lost count of the number of books I've seen which tell you that
|> it is ok to call a subroutine like this as
|>	call matadd(a, a, c, m, n)
|> or	call matadd(a, b, a, m, n)

I've not seen any books that recommend that.  I have seen books that
recommend _against_ it.  There is an aliased call that is legal in
Fortran:

	call matadd(a, b, b, m, n)

It's safe to alias the second and third arguments because neither of
them is assigned to by the procedure.  Most books on Fortran recommend
that you don't do even this, since you can't be sure and it's better
to be safe.  Numerical Recipes (the one for Fortran) has _one_ program
in it which recommends an example of unsafe aliasing - but, in their
case, so many of their algorithms need work that it is not surprising
that they make such an error.

|> If Jim Giles accompanied his criticism of C for allowing aliasing
|> with a swipe at Fortran compiler vendors whose products do not by
|> default warn users when they commit the "error" of aliasing, then
|> I'd be impressed by his fairness.

Quite so.  I have always criticized Fortran implementations for this.
Especially since it does not require a run-time test.
If the compiler were to put the proper information into the object code, the loader could propagate the identities of procedure arguments through the call chain and detect all such illegal aliasing at load-time. It is, in fact, almost the same algorithm that I've been recommending for use to optimize C code at load-time. J. Giles
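As a rough illustration of the load-time check described above, here is a toy sketch in C.  Everything in it is hypothetical (the call-site record, the field names): a real loader would recover this information from annotations the compiler put in the object files, and would also have to handle partial array overlaps, which this sketch ignores.

```c
#include <string.h>

/* Hypothetical record a compiler could emit per call site: the actual
   arguments by name, plus a flag per dummy argument saying whether
   the callee may assign through it (OUT or INOUT, in Fortran 90
   terms). */
struct call_site {
    const char *callee;
    int nargs;
    const char *arg[4];
    int may_write[4];
};

/* Flag any pair of identical actual arguments where at least one of
   the corresponding dummies may be assigned -- exactly the aliasing
   Fortran forbids.  Note that matadd(a, b, b, m, n) passes the
   check, since neither aliased dummy is written. */
static int illegal_aliasing(const struct call_site *c)
{
    int i, j;

    for (i = 0; i < c->nargs; i++)
        for (j = i + 1; j < c->nargs; j++)
            if (strcmp(c->arg[i], c->arg[j]) == 0 &&
                (c->may_write[i] || c->may_write[j]))
                return 1;
    return 0;
}
```

For example, {"a", "a", "c"} with the first dummy writable is flagged, while {"a", "b", "b"} with only the first dummy writable is not; propagating these records through the call chain is the part the loader would add.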
gwyn@smoke.brl.mil (Doug Gwyn) (05/02/91)
In article <1991Apr29.213922.27863@agate.berkeley.edu>
rkowen@violet.berkeley.edu (Dario Bressanini) writes:
>Well, I tried on mini systems (Gould, Convex ...).  When I used
>f77 (the standard Unix compiler) the performance was *VERY* bad,
>but when I switched to the vendor's compiler the Fortran code
>was faster than the C code.

Note that most UNIX "f77" and "cc" compilers use the exact same
compiler technology, including a common code generator.  For a fair
comparison of the properties of application implementation in
different languages, that is appropriate.  Certainly, switching to a
more highly tuned compiler would produce faster generated code, but
that would be true for C as well as for Fortran.

Another relevant factor may be that the UNIX system implementors were
not as concerned about obtaining optimal code for Fortran, since the
vast majority of UNIX is implemented in C; therefore, even using the
same technology the C compiler variant may have been somewhat more
highly tuned than the Fortran variant.

Anecdotal evidence comparing compiler performance is thus rather
useless in an argument about the run-time efficiency of programs
coded in different languages.  Note also that that criterion is by no
means the most important one for selection of a programming language.
burley@pogo.gnu.ai.mit.edu (Craig Burley) (05/02/91)
In article <22983@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:

   In article <5475@goanna.cs.rmit.oz.au>, ok@goanna.cs.rmit.oz.au
   (Richard A. O'Keefe) writes:
   |> [...]  Can Fortran Extended pointers be used for this
   |> If Jim Giles accompanied his criticism of C for allowing aliasing
   |> with a swipe at Fortran compiler vendors whose products do not by
   |> default warn users when they commit the "error" of aliasing, then
   |> I'd be impressed by his fairness.

   Quite so.  I have always criticized Fortran implementations for
   this.  Especially since it does not require a run-time test.  If
   the compiler were to put the proper information into the object
   code, the loader could propagate the identities of procedure
   arguments through the call chain and detect all such illegal
   aliasing at load-time.  It is, in fact, almost the same algorithm
   that I've been recommending for use to optimize C code at
   load-time.

Uh, maybe "all obvious cases of illegal aliasing", but not all cases.

Remember that whether a given definition occurs, even of a dummy
argument, is entirely a run-time decision.  I.e. the statement "Dummy
argument X is defined by MATADD" is not really the opposite of the
statement "Dummy argument X is not defined by MATADD".  Instead, the
statements must read as follows to be opposites: "Dummy argument X is
known to be defined during any invocation of MATADD"; "Dummy argument
X is not known to be defined...".

(Another way to explain: Fortran 90 has the INTENT(IN), INTENT(OUT),
and INTENT(INOUT) specifications for dummy arguments to indicate
whether a given dummy argument is (IN) possibly referenced but never
defined; (OUT) possibly defined but never referenced prior to an
initial definition; (INOUT) possibly defined and possibly referenced.
However, the default is not INTENT(INOUT) as might be suspected,
because a given dummy is not necessarily known to be possibly
referenced and/or possibly defined at compile time, and THAT is the
default.
For a default-INTENT dummy, any actual argument may be passed
(essentially); for INTENT(INOUT), only a definable and defined actual
argument may be passed, and so on.)

Obviously, if the decision as to whether a dummy gets defined or even
referenced is made at run-time, then it is impossible for a compiler
or linker/loader to do a complete check as to whether aliasing errors
have been coded into the application.

That having been said, I do think 99% or more of all aliasing errors
committed in Fortran code are detectable (without falsing) at the
link/load phase (i.e. more than 99% are provable and obvious).

Oh, this assumes that reporting a given call as violating an aliasing
requirement is not a "false" if that call wouldn't actually get
executed during a given run of the program as a whole.  I.e. if one
gets a message that "CALL MATADD(A,B,B,N)" does invalid aliasing
because the 3rd argument of MATADD gets modified, it shouldn't be
considered a "false" report if the CALL itself happens to never (or
rarely) get executed.  (In fact, this is why I like compile/load-time
checking, as long as it doesn't report further levels of falses --
one gets checking that, if only performed at run-time, might find
cases one might never find by just running the program.  Cases like
what happens when particular data sets are used, for example.)
--
James Craig Burley, Software Craftsperson    burley@gnu.ai.mit.edu
jlg@cochiti.lanl.gov (Jim Giles) (05/02/91)
In article <BURLEY.91May1153042@pogo.gnu.ai.mit.edu>,
burley@pogo.gnu.ai.mit.edu (Craig Burley) writes:
|> [...]
|> Obviously, if the decision as to whether a dummy gets defined or even
|> referenced is made at run-time, then it is impossible for a compiler
|> or linker/loader to do a complete check as to whether aliasing errors
|> have been coded into the application.

True.  The test I had in mind would signal a message whenever an
instance of aliasing occurred which _might_ be illegal.  That is, if
either of the aliased entities is _possibly_ assigned to within the
procedure (this is a simple thing for the compiler to determine - the
variables either appear on the left of an assignment _somewhere_ or
not).  In this respect, I simply assume that the possible will happen.
The bottom line is that any variable with the OUT or INOUT attribute
(or one that would have that attribute if you made an interface for
it) must not be aliased, or I issue a message.

|> [...]
|> That having been said, I do think 99% or more of all aliasing errors
|> committed in Fortran code are detectable (without falsing) at the
|> link/load phase (i.e. more than 99% are provable and obvious).

I agree.  This is also true of all C code except where pointer
assignment has been performed.  Then the issue gets muddy indeed.  The
only safe answer is that the pointers involved are now _maybe_ aliased
to all the objects in the transitive closure of assignments to which
they belong.  This is also simple for the compiler to determine, but
will result in a lot of alias warnings that aren't actually true.
Fortran pointers will suffer the same problem.

The best solution is to extend the languages to allow the user to
explicitly declare which pointers are allowed to be aliased and to
what.  Then the compiler can actually enforce the constraints (at
compile time).  This gets back to the issue of "empowerment".  Giving
the user explicit control is empowerment.  Automatically making the
decision for him is not.
Though, if an automatic method _mostly_ does what you want, you may be
satisfied to accept the default most of the time - just as long as you
are given the power to override the default when you want to.  With
respect to aliasing, neither Fortran nor C gives you that power.
Using the loader in the manner I've described would give the user
feedback about how to use that power most effectively - and maybe,
therefore, convince implementors and language designers to provide the
necessary features.

J. Giles
clive@x.co.uk (Clive Feather) (05/02/91)
In article <5475@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au
(Richard A. O'Keefe) writes:
> The key difference is
> -- C permits aliasing.  The price of this is that in the absence of
>    good interfile analysis, you get inferior optimisation.  This is
>    a Quality of Implementation issue, because interfile analysis
>    _can_ be done without violating the standard.
> -- Fortran forbids aliasing.  The price of this is that in the
>    absence of good interfile analysis, the fact that the code that
>    has been generated is WRONG because there _is_ aliasing goes
>    quietly un-noticed.

[He's not the only one to do so, but this message was to hand when I
finally decided to contribute to this thread]

The standard (3.9.6) says:

    "The use of two parameters declared with an array type (prior to
    their adjustment to pointer type) in separate lvalues to designate
    the same object is an obsolescent feature."

Looking at this and the rationale, it seems to me that the intention
of the committee was:

    void mangle_two_arrays (double *a, double *b, size_t n)
    {
        /* The compiler must generate code which will work if a and b
           point to the same object, or if a == b + 1, etc.  In other
           words, traditional C semantics.  Future versions of the
           standard will retain these semantics. */
    }

    void mangle_two_arrays (double a [], double b [], size_t n)
    {
        /* At present this has the same semantics as above.  However,
           a future version of the standard will probably change the
           semantics of this procedure to those in the following
           comment; compiler implementers should look at this as a
           reasonable extension, and programmers should use this
           notation when they want these semantics. */

        /* The compiler is entitled to assume that a and b point to
           different objects, and having them point to the same object
           will cause undefined behaviour.  In other words, Fortran
           semantics. */

        /* If this procedure only accesses a [0] to a [n-1] and b [0]
           to b [n-1], it is presumably safe for b == a + n.  However,
           it is undefined what happens if b == a + n - 1, even if the
           code is such as the following. */

        size_t i;

        for (i = 0; i < n; i++)
            b [i] = sin (a [i]) - cos (b [i]);
    }

What I am wondering is:
(1) am I reading the standard correctly?
(2) why hasn't Chris Torek (or Doug Gwyn or Henry Spencer or Mark
    Brader or ...) mentioned this before?
(3) is Chris wrong (shock, horror; I can sense the flames coming at me
    already :-) when he says that declaring a parameter as an array is
    always a Bad Thing (TM)?
--
Clive D.W. Feather     | IXI Limited         | If you lie to the compiler,
clive@x.co.uk          | 62-74 Burleigh St.  | it will get its revenge.
Phone: +44 223 462 131 | Cambridge CB1 1OJ   |   - Henry Spencer
(USA: 1 800 XDESK 57)  | United Kingdom      |
sef@kithrup.COM (Sean Eric Fagan) (05/03/91)
In article <22841@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>If it is done by a
>post-translation tool, then it _is_ standard conforming.

Why do you have this hangup about "post-translation"?  ANSI says
*nothing* about how code is produced, yet, obviously, the code you
write must be executed somehow.  Some systems, such as Saber-C, will
execute as they translate (or, perhaps, do a quick translation and
then execute the intermediate results).  Others can require you to
type in 42 commands to produce an executable that you can then run:
each of those 42 steps is part of the "translation" process, as Jim
so ignorantly puts it.  The Unix linker is, essentially, part of the
"translation" process for C on Unix.  So there is no
"post-translation" optimisation.
--
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.
jlg@cochiti.lanl.gov (Jim Giles) (05/03/91)
In article <1991May02.173643.9528@kithrup.COM>, sef@kithrup.COM
(Sean Eric Fagan) writes:
|> [...]
|> The Unix linker is, essentially, part of the "translation" process
|> for C on Unix.  So there is no "post-translation" optimisation.

You need to read your copy of the standard more carefully.  The C
standard _specifically_ makes a distinction between translation and
linking.  Source 'files' may be 'translated' in any order and
separately, and subsequently 'linked' together to form a program.
Define these terms how you like, but don't expect any vendors to
agree with you.  In the meantime, what the standard has in mind is
perfectly clear: separate compilation.

J. Giles