wilson@uicbert.eecs.uic.EDU (Paul Wilson) (09/15/88)
I've been thinking about using dynamic compilation for Scheme, and wondering if anybody else is. Have people abandoned the idea of dynamic compilation, and if so, was it premature? Some of the advantages might be:

 1) Portability: images could still run fast.

 2) An optimizer could be called in to optimize those chunks of code that are ACTUALLY executed a bunch of times. (Rather than just optimizing the time/space tradeoff between interpreted and compiled code, it could be used to adjust compile-time/run-time tradeoffs.)

 3) Transparent, automatic compilation and optimization.

In particular, I'm thinking about a scheme in which inlining would be done automatically, but dependencies would be recorded. If somebody changed the value of an inlined procedure or constant, the code depending on those values would be invalidated and dynamically recompiled at run time. (The mechanism for this depends on near-lexical scoping. Procedures that are never redefined in a lexically apparent way could be inlined. Primitives that can modify things (like first-class environments) in non-lexically-apparent ways would have to check for dependencies.)

Anybody care to comment on this idea? In particular, is there some irreducible cost of dynamic compilation that I don't know about? Any advantages?

(It also seems to me that you might dynamically determine what types of arguments frequently-executed procedures are frequently called with. You could then compile special versions for the common combinations. A large-grained check would dispatch to the proper version, within which most type checking could be eliminated using type inference. Would this be workable/worthwhile?)

Paul R. Wilson
Human-Computer Interaction Laboratory
U. of Illin. at C. EECS Dept. (M/C 154)
Box 4348, Chicago, IL 60680
wilson%uicbert@uxc.cso.uiuc.edu
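To make the dependency-recording idea above concrete, here is a minimal sketch in portable Scheme of the bookkeeping such a system would need. The association-list table and the names record-dependency!, redefine!, and recompile! are invented for illustration only; a real implementation would hang this machinery off the compiler and the top-level definition mechanism.

  ;; For each inlined name, remember the names of the procedures whose
  ;; compiled code inlined its value.
  (define *dependents* '())   ; alist: name -> list of dependent names

  (define (record-dependency! inlined-name dependent-name)
    (let ((entry (assq inlined-name *dependents*)))
      (if entry
          (set-cdr! entry (cons dependent-name (cdr entry)))
          (set! *dependents*
                (cons (list inlined-name dependent-name) *dependents*)))))

  ;; Stand-in for the real recompiler: here we just note the invalidation.
  (define (recompile! name)
    (display "invalidated, will recompile: ")
    (display name)
    (newline))

  ;; Called by the system whenever a top-level name is redefined.
  (define (redefine! name new-value)
    (let ((entry (assq name *dependents*)))
      (if entry
          (begin (for-each recompile! (cdr entry))
                 (set-cdr! entry '()))))   ; old dependents now invalid
    ;; ... then actually install NEW-VALUE in the global environment ...
    )

  ;; Example: FAST-AREA inlined the value of PI, so redefining PI
  ;; must invalidate FAST-AREA's compiled code.
  (record-dependency! 'pi 'fast-area)
  (redefine! 'pi 3.14159265358979)
  ;; prints: invalidated, will recompile: fast-area

As the near-lexical-scoping point suggests, the record-dependency! calls can be emitted at compile time for exactly those names the compiler chose to inline, so compiled code pays nothing until a redefinition actually occurs.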
adg@cl.cam.ac.uk (Andy Gordon) (09/20/88)
In article <8809142159.AA28551@uicbert.eecs.uic.edu> wilson@uicbert.eecs.uic.EDU (Paul Wilson) writes:
>I've been thinking about using dynamic compilation for Scheme,
>and wondering if anybody else is. Have people abandoned the idea
>of dynamic compilation, and if so, was it premature?

Firstly, by "dynamic compilation" I understand a system that represents a programme to start with as a syntax tree or some other form suitable for interpretation, but compiles parts of it into native code as execution proceeds. Interpreted code is generally smaller than compiled code so less space is needed than compiling the whole thing.

I worked on a related problem for my honours thesis---how to interface compiled and interpreted code in the Edinburgh Standard ML compiler. In the system I built, the programmer indicated by an annotation on each function whether it was to be compiled or interpreted, but I did have in mind a scheme that would make the annotations automatically, based on data from previous runs, or even convert interpreted functions into native code dynamically if they were frequently executed. On small benchmarks I found that native code was between four and six times bigger than interpreted byte codes, and ran between one and seven times faster.

Let's call a system that mixes compiled and interpreted code a "hybrid" system, and then distinguish between "static" and "dynamic" hybrids, i.e., between those in which the mode of each function declaration is determined at compile time and those in which it can change at run time. So I built a static hybrid system, and I think you plan to build a dynamic one.

There appear to be two reasons for hybrid systems: (1) to give a variable time/space tradeoff, i.e., between fast/bulky native code and slow/lean interpreted code; (2) to allow fancy interpretive debuggers and tracers in the presence of native code.

I don't think reason (1) is very compelling these days, because the size of compiled code is not an issue with today's computers---certainly, big ML programmes that are compiled to native code by one of the production compilers don't take up inordinate space.

However, reason (2) is still attractive, because interpreted code will always be easier to debug than compiled code. A dynamic hybrid system is all the work of a static one, and more, so for it to be worthwhile it must make debugging much easier. One can imagine a debugger threading through a programme, seeking to use interpreted versions of compiled functions as it finds them---maybe you could keep a pointer in each compiled function to an interpreted version or the source. I'd be interested to hear how you get on.

Here are some references you might find useful when implementing your scheme. I published a technical report [1] that has a comprehensive literature survey on hybrid compilation/interpretation and presents a taxonomy of ways to switch between compiled and interpreted modes. I have a few copies that I could post to anyone who is interested (email me). Other papers that might be of interest: Brown on "throw-away compiling" [2], Low on automatically selecting an appropriate representation for an abstract data type depending on usage statistics [3], and Dakin and Poole on a text editor that mixes compilation and interpretation, together with performance graphs [4].

[1] A. D. Gordon, "How to Breed Hybrid Compiler/Interpreters", Technical Report ECS-LFCS-88-50, Laboratory for Foundations of Computer Science, Computer Science Dept., University of Edinburgh, Edinburgh EH9 3JZ, Scotland, April 1988.
[2] P. J. Brown, "Throw-away compiling", Software: Practice and Experience, 6, pages 423--434, 1976.
[3] J. R. Low, "Automatic data structure selection: an example and overview", CACM, 21(5), pages 376--385, May 1978.
[4] R. J. Dakin and P. C. Poole, "A mixed code approach", The Computer Journal, 16(3), pages 219--222, 1973.
ACW@IVORY.S4CC.SYMBOLICS.COM (Allan C. Wechsler) (09/22/88)
    Date: 19 Sep 88 20:36:51 GMT
    From: mcvax!ukc!cam-cl!adg@uunet.uu.net (Andy Gordon)

    [...] Interpreted code is generally smaller than compiled code so
    less space is needed than compiling the whole thing.

This guy's whole approach is based on this premise. Does anyone want to disabuse him? For a simple recursive factorial, for example, I get about 25 words interpreted and about ten compiled. I don't know what happens with large programs, but I suspect they show about the same ratio.
Krulwich-Bruce@cs.yale.edu (Bruce Krulwich) (09/23/88)
In article <327@scaup.cl.cam.ac.uk>, adg@cl (Andy Gordon) writes:
>In article <8809142159.AA28551@uicbert.eecs.uic.edu>
>wilson@uicbert.eecs.uic.EDU (Paul Wilson) writes:
>>I've been thinking about using dynamic compilation for Scheme,
>>and wondering if anybody else is. Have people abandoned the idea
>>of dynamic compilation, and if so, was it premature?
>
>Firstly, by "dynamic compilation" I understand a system that
>represents a programme to start with as a syntax tree or some other form
>suitable for interpretation, but compiles parts of it into native code
>as execution proceeds. Interpreted code is generally smaller than compiled
>code so less space is needed than compiling the whole thing.

I think you're overlooking several possible program transformations that could be done in addition to straightforward compilation, such as:

(1) Inlining functions, and de-inlining them when necessary
(2) Different levels of compiled-code optimization
(3) Caching of results
(4) Special-case handling of parameters by type
(5) Analysis of memory use to reduce garbage generation
(6) Dynamic analysis of the ordering of comparisons

>There appear to be two reasons for hybrid systems: (1) to give a variable
>time/space tradeoff, i.e., between fast/bulky native code and slow/lean
>interpreted code; (2) to allow fancy interpretive debuggers and tracers
>in the presence of native code.
>
>I don't think reason (1) is very compelling these days, because the size of
>compiled code is not an issue with today's computers---certainly, big ML
>programmes that are compiled to native code by one of the production compilers
>don't take up inordinate space.

As I said above, I would bet that this does become an issue when more possible transformations are considered.

>However, reason (2) is still attractive, because interpreted code will always
>be easier to debug than compiled code. A dynamic hybrid system is all the
>work of a static one, and more, so for it to be worthwhile it must make
>debugging much easier. One can imagine a debugger threading through a
>programme, seeking to use interpreted versions of compiled functions as it
>finds them---maybe you could keep a pointer in each compiled function to an
>interpreted version or the source. I'd be interested to hear how you get on.

This is a good point. Similar uses might be the ability to redefine a component of a function without recompiling the entire function, or the ability to optimize only the inner loops of a function.

Bruce Krulwich
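As a concrete illustration of transformation (4), and of Wilson's original question about compiling special versions for common argument types, here is a small sketch in portable Scheme. The procedure names and the hand-written dispatcher are invented for illustration; a dynamic compiler would generate both the specialized body and the entry test from observed call-site statistics, and the specialized body would use unchecked fixnum arithmetic rather than the generic operations shown here.

  ;; Specialized version the compiler might emit after observing that
  ;; SUM is almost always called on lists of small integers; inside it,
  ;; per-element type checks would be removed by type inference.
  (define (sum-fixnum-list lst)
    (let loop ((l lst) (acc 0))
      (if (null? l)
          acc
          (loop (cdr l) (+ acc (car l))))))

  ;; Fully general fallback version.
  (define (sum-generic lst)
    (if (null? lst)
        0
        (+ (car lst) (sum-generic (cdr lst)))))

  ;; Large-grained check at entry dispatches to the right version.
  ;; NB: this coarse test only samples the first element; it is safe
  ;; here only because the "specialized" body still uses generic +.
  (define (sum lst)
    (if (and (pair? lst) (integer? (car lst)))
        (sum-fixnum-list lst)
        (sum-generic lst)))

  (sum '(1 2 3 4))    ; => 10, via the specialized version
  (sum '(1.5 2.5))    ; => 4.0, via the fallback

A real specializer would need an entry guard it can actually trust, or a specialized body that fails safely when the guard is too coarse, which is exactly where the type inference Wilson mentions earns its keep.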
hbs@LUCID.COM (Harlan Sexton) (09/23/88)
I wouldn't be surprised to find some interpreted representations bigger than machine code. However, you can usually get byte-code to be smaller than machine code by factors of about 4 to 10 (higher for RISC machines, in my limited experience), so compiling to byte-codes can win. Also, in spite of what someone said, memory is NOT so cheap that it doesn't matter how big the code for an application is, and I can imagine that for certain classes of applications it never will be.

--Harlan