Tom.Lane@ZOG.CS.CMU.EDU (08/22/88)
> Does anyone else think that converting a series of digits into an integer
> is inappropriate for a lexical analyser? It seems to be a very common
> thing to do, but I can see practically no advantages to it, and several
> disadvantages.

The main reason for converting constants to binary is so the compiler can do arithmetic on them. Somebody already mentioned constant folding, but nobody has yet pointed out the most crucial case where the compiler must do this: where the constants in question are array subscript bounds. You *must* do arithmetic at compile time to do storage allocation! (Unless you want to use a dope vector and run-time storage allocation for every array, which is mighty expensive.)

I once worked on a cross-compiler that ran on a 16-bit-integer machine but produced code for a 32-bit-integer machine. Integer constants smaller than 32k were converted to binary, but we left larger ones in text form until the assembly pass. Users weren't allowed to declare arrays of more than 32k elements...

[That compiler also left floating point constants in text form, mainly for accuracy reasons: the machines' floating point formats differed.]

--
tom lane
Internet: tgl@zog.cs.cmu.edu
UUCP: <your favorite internet/arpanet gateway>!zog.cs.cmu.edu!tgl
BITNET: tgl%zog.cs.cmu.edu@cmuccvma
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
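To make the array-bounds point concrete, here is a minimal sketch in C of the arithmetic a compiler has to perform on constant subscript bounds before it can lay out storage. It is not taken from the compiler described above; `struct dim_bounds`, `array_storage_size`, and `element_offset` are hypothetical names, and the sketch assumes a one-dimensional array on a target whose sizes fit in 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor for one dimension, e.g. Pascal's array[lo..hi]. */
struct dim_bounds {
    int32_t lo;
    int32_t hi;
};

/* Compute the bytes needed for the whole array.
 * Returns false if the bounds are empty or the size does not fit in
 * 32 bits (the assumed target limit). */
bool array_storage_size(struct dim_bounds b, uint32_t elem_size,
                        uint32_t *out_bytes)
{
    if (b.hi < b.lo)
        return false;

    /* Widen before subtracting so hi - lo + 1 cannot overflow. */
    uint64_t count = (uint64_t)((int64_t)b.hi - (int64_t)b.lo + 1);
    uint64_t bytes = count * elem_size;

    if (bytes > UINT32_MAX)
        return false;

    *out_bytes = (uint32_t)bytes;
    return true;
}

/* Byte offset of element `index` from the start of the array.
 * Assumes the array already passed array_storage_size(). */
uint32_t element_offset(struct dim_bounds b, uint32_t elem_size,
                        int32_t index)
{
    assert(index >= b.lo && index <= b.hi);
    return (uint32_t)(((int64_t)index - b.lo) * elem_size);
}
```

Both functions need the bounds as numbers, not as digit strings, which is why a fully hands-off treatment of integer constants breaks down for array declarations.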
markhall@pyramid.pyramid.com (Mark Hall) (08/30/88)
In article <2299@ima.ima.isc.com> Tom.Lane@ZOG.CS.CMU.EDU writes:
>
>The main reason for converting constants to binary is so the compiler
>can do arithmetic on them. Somebody already mentioned constant folding,
>but nobody has yet pointed out the most crucial case where the compiler
>must do this: where the constants in question are array subscript
>bounds. You *must* do arithmetic at compile time to do storage
>allocation! (Unless you want to use a dope vector and run-time
>storage allocation for every array, which is mighty expensive.)
>
>[stuff deleted] Users weren't allowed to declare
>arrays of more than 32k elements... [stuff deleted]
>

If anyone out there is writing a compiler from scratch, please don't do what is described above. Whenever compile-time arithmetic is required, calls should be made to a retargetable module which can carry out the computation in the same precision and semantics of the target machine. If the host has more precision than the target, then the representation possibly `must' be strings, but it's far from impossible to do arithmetic on strings!

Other possibilities exist. One might be able to successfully represent target `int's using host double-precision floating point (this worked for one host-target pair that I wrote a compiler for). On pg. 97 of:

%A William Waite
%A Gerhard Goos
%T Compiler Construction
%I Springer-Verlag
%C New York, NY
%D 1984

there is a more elaborate description of how (and why) this can be done.

You might not think your compiler will get targeted to another product line, but just when you least suspect it, the CEO will drop in and insist that it be done in 1 month!

[From markhall@pyramid.pyramid.com (Mark Hall)]
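As a rough illustration of the "retargetable module" idea, the sketch below folds integer constants at the target's width instead of the host's. It is an assumption-laden example, not the design from Waite and Goos: `target_wrap`, `target_add`, and `target_mul` are hypothetical names, the target is assumed to use two's-complement arithmetic that wraps on overflow, and the host is assumed to have 64-bit integers (the easy case, where the host is wider than the target).

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a host value to a signed two's-complement value of `bits` width.
 * Wraparound on overflow is an assumption about the target; a target
 * that traps on overflow would need a different policy here. */
int64_t target_wrap(int64_t value, unsigned bits)
{
    assert(bits >= 2 && bits <= 32);

    uint64_t modulus = UINT64_C(1) << bits;
    uint64_t low = (uint64_t)value & (modulus - 1);
    uint64_t sign_bit = UINT64_C(1) << (bits - 1);

    /* Sign-extend: values with the top bit set are negative. */
    if (low & sign_bit)
        return (int64_t)low - (int64_t)modulus;
    return (int64_t)low;
}

/* Fold a + b as the target would compute it. */
int64_t target_add(int64_t a, int64_t b, unsigned bits)
{
    return target_wrap(target_wrap(a, bits) + target_wrap(b, bits), bits);
}

/* Fold a * b as the target would compute it. Operands are wrapped
 * first, so for bits <= 32 the product cannot overflow int64_t. */
int64_t target_mul(int64_t a, int64_t b, unsigned bits)
{
    return target_wrap(target_wrap(a, bits) * target_wrap(b, bits), bits);
}
```

With this shape, retargeting the folder means changing the `bits` argument (or whatever target description supplies it) rather than hunting down every place the compiler used a host `int`. For example, `target_add(32767, 1, 16)` yields -32768, while the same call with `bits` set to 32 yields 32768.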
tgl@zog.cs.cmu.edu (Tom Lane) (09/05/88)
In article <2299@ima.ima.isc.com> I wrote:
>I once worked on a cross-compiler that ran on a 16-bit-integer machine
>but produced code for a 32-bit-integer machine. Integer constants
>smaller than 32k were converted to binary, but we left larger ones in
>text form until the assembly pass. Users weren't allowed to declare
>arrays of more than 32k elements...

In article <2370@ima.ima.isc.com>, markhall@pyramid.pyramid.com (Mark Hall) replied:
>If anyone out there is writing a compiler from scratch, please
>don't do what is described above. Whenever compile-time
>arithmetic is required, calls should be made to a retargetable
>module which can carry out the computation in the same precision
>and semantics of the target machine. If the host has more [less? TL]
>precision than the target, then the representation possibly
>`must' be strings, but it's far from impossible to do arithmetic
>on strings!

I agree *in principle*. In practice there are some other considerations. The compiler I described was a bootstrap system, which we fully intended to scrap once we had a stable development platform on the target hardware. In that context, building a multiple-precision integer arithmetic package just wasn't worth the effort; the compiler's (not very severe) restrictions could be lived with.

For a production cross-compiler, it would make sense to do things as Hall suggests. Note that the implications are extensive: for example, the offsets to local variables in a procedure's stack frame would need to be target-integers. Thus doing it right impacts the compiler's symbol table, as well as virtually all aspects of code generation. In currently popular systems programming languages, the notational inconveniences alone would be a tremendous problem ("add(x,convert_int(1))" instead of "x+1").

Dealing with floating-point arithmetic is much harder. For instance, I recall reading horror stories about early Fortran systems in which compile-time conversion of a floating point constant could give a different result than run-time input of the same character string. A cross-compiler that does constant-expression folding is going to have a very hard time ensuring that it gets exactly the same result as run-time evaluation would. (This may get easier in future, as more machines are built to the IEEE floating-point standards.)

The only good aspect of the situation is that cross-compilers are usually used for development of systems software, in which optimization of floating point arithmetic isn't much needed. Therefore, the problem can be bypassed by passing F.P. constants through in text form, and not attempting to precompute any constant F.P. expressions... which is exactly what we did, as did some other recent posters. (Then the only problem is correctly converting F.P. constants to bit strings in the cross-assembler.)

The article that started this discussion proposed the pass-through, "hands-off" approach for *all* constants, integer as well as floating point. The point I tried to make is that the semantics of programming languages often require the compiler to do calculations with integer constants; so the fully hands-off approach is not workable. (Compile-time floating-point operations are never required in Pascal or C; I'm not too sure about Ada.)

Hall's point is that having to do calculations does not mean having to assume that host-integers are the same as target-integers. This is a valid point, and is probably the right attitude to take in a production quality cross-compiler; but the cost is not trivial.
--
tom lane
Internet: tgl@zog.cs.cmu.edu
UUCP: <your favorite internet/arpanet gateway>!zog.cs.cmu.edu!tgl
BITNET: tgl%zog.cs.cmu.edu@cmuccvma
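The notational cost mentioned above ("add(x,convert_int(1))" instead of "x+1") can be seen in a small sketch. The names `add` and `convert_int` come from the post; everything else is an assumption made for illustration: a 32-bit target integer is held as two 16-bit halves so that a host with 16-bit integers could manipulate it, and `target_int` and `example_increment` are hypothetical names. This is only the addition case, not a full multiple-precision package.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit target integer, stored as two 16-bit halves. */
typedef struct {
    uint16_t hi;
    uint16_t lo;
} target_int;

/* Convert a host-sized (16-bit) integer to a target integer,
 * sign-extending into the upper half. */
target_int convert_int(int16_t v)
{
    target_int t;
    t.lo = (uint16_t)v;
    t.hi = (v < 0) ? 0xFFFFu : 0x0000u;
    return t;
}

/* 32-bit wraparound addition built from 16-bit pieces. */
target_int add(target_int a, target_int b)
{
    target_int r;
    r.lo = (uint16_t)(a.lo + b.lo);
    /* The low half wrapped exactly when the result is below an operand. */
    uint16_t carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = (uint16_t)(a.hi + b.hi + carry);
    return r;
}

/* Usage: the target-integer spelling of "x + 1". */
void example_increment(void)
{
    target_int x = { 0x0000u, 0xFFFFu };    /* 65535 */
    target_int y = add(x, convert_int(1));  /* 65536 */
    assert(y.hi == 0x0001u && y.lo == 0x0000u);
}
```

Every stack-frame offset, constant, and size in the compiler would have to be written this way, which is the cost being weighed against Hall's recommendation.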
wendyt@pyrps5.pyramid.com (Wendy Thrash) (09/09/88)
In article <2492@ima.ima.isc.com> tgl@zog.cs.cmu.edu (Tom Lane) writes:
>A cross-compiler that does constant-expression folding is going to have
>a very hard time ensuring that it gets exactly the same result as run-time
>evaluation would. (This may get easier in future, as more machines
>are built to the IEEE floating-point standards.) . . .
>[T]he problem can be bypassed by passing F.P. constants through in text form,
>and not attempting to precompute any constant F.P. expressions . . . (Then
>the only problem is correctly converting F.P. constants to bit strings in the
>cross-assembler.)

Actually, IEEE 754 raises new questions about compile-time floating point while it's answering some old ones. For example, since rounding mode can affect the value of a conversion, and rounding mode can be set at runtime (though not easily, in most languages) one could argue that conversions from character strings (e.g. 1.0) into f.p. numbers (e.g. 0x3f800000) should be done at runtime, not at compile or assembly time.

As for performing f.p. arithmetic on constants at compile time, I have mixed feelings. It's true that constant folding could clear up garbage left over from the use of #define, but it certainly defeats any attempts I may be making to control rounding mode. Moreover, I'm concerned about the application of arithmetic "identities" at compile time: if I write

	y = (x - 1.0) + 1.0;

there's a very good reason for it, and I don't want the compiler to mess it up, no matter what it is allowed to do by the language definition. Please, at least honor my parentheses in floating-point computations. If you're ignoring parentheses in the course of optimization, give me a way to stop you from doing that, without disabling optimization completely.

Remember that f.p. numbers are not quite real numbers. For example,

	double x;
	if (x != x) do_something();

can result in a call to do_something() if x is a NaN (IEEE 754 not-a-number).

Floating-point code is strange stuff. Many battles are yet to be fought between compiler writers and f.p. users. (See the recent discussion begun by David Hough's comments on ANSI C in comp.lang.c.) Please take care not to optimize f.p. codes into meaninglessness.

[From wendyt@pyrps5.pyramid.com (Wendy Thrash)]
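The two fragments in the post above can be turned into a small, checkable sketch. It assumes IEEE 754 double precision with the default round-to-nearest mode, and a compiler that is not run with value-changing floating-point options. The function names are hypothetical, and `volatile` is used here only as one way to discourage the compiler from simplifying the expressions; it does not claim to reproduce the poster's actual reason for writing `(x - 1.0) + 1.0`.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* True when x compares unequal to itself, which under IEEE 754
 * happens only for a NaN. */
bool is_nan_by_self_compare(double x)
{
    volatile double v = x;
    return v != v;
}

/* Evaluate (x - 1.0) + 1.0 in the order written. The intermediate
 * store keeps the subtraction as a separately rounded step. */
double sub_then_add_one(double x)
{
    volatile double t = x - 1.0;
    return t + 1.0;
}
```

Under the stated assumptions, `sub_then_add_one(1e-20)` returns 0.0 rather than 1e-20, because 1e-20 is far smaller than the spacing of doubles near 1.0 and is lost in the subtraction. A compiler that rewrote the expression to plain `x` by applying the real-number identity would change that result, which is the kind of "optimization" the post objects to. `is_nan_by_self_compare(NAN)` returns true, so folding `x != x` to false would be equally wrong.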