jrk@s1.sys.uea.ac.uk (Richard Kennaway CMP RA) (11/18/88)
In article <8811161428.AA13775@toucan.LCS.MIT.EDU>, bard@THEORY.LCS.MIT.EDU writes: > > I thought that the reason for using supercombinators rather than S and K is > space. Ordinary \-calculus expressions can be translated into SK-combinatory > expressions, but the combinatory version is exponentially longer than the SK > term. Some sets of supercombinators give polynomial-length expansions; I > think some give linear expansions. Here's a bunch of references for the complexity of various combinator translations. I didnt see all of this discussion, so my apologies if I'm repeating things that have already been said here. Yes, SK is exponential, which is why (as far as I know) noone has ever used it for a machine. Turner gave a translation into S, K, B, C, and I (Bxyz = x(yz), Cxyz = xzy, Ix = x), (Software P&E, vol9, 1979), which is cubic in the worst case. He then added three more combinators S'wxyz = w(xz)(yz), B'wxyz = wx(yz), and C'wxyz = wxzy, (J.Symb.Logic, vol.44, 1979) which is quadratic in the worst case. Counting arguments show that the best you can do with any fixed set of combinators is nlogn. This can be achieved (Noshita, Inf.Proc.Letters, vol.20, 1985, and - ahem - Kennaway, Inf.Proc.Letters, vol.24, 1987) though first run-length encoding the runs of identical combinators that Turner's second translation tends to give, then programming the run-lengths in combinators. The last step is purely of theoretical interest, in practice you might as well directly implement the run-length encoded version, or the closely related "director strings" (Kennaway, op.cit, and Kennaway&Sleep, ACM Toplas, October 1988), as has been done on the SKIM2 machine (Stoye, Proc. Functional Programming Workshop, Goteborg, 1985), to obtain a modest improvement in speed of evaluation of expressions, compared with Turner's second translation. (This is probably more important than getting the code size down - in practice I doubt the quadratic effects are significant.) Burton (Inf.Proc.Letters, vol.14, 1982) claims a linear translation, but this depends on a different measure of the size, and when measured on the same basis as the other references, it's also nlogn. The basic idea embodied in Turner's second translation, and in director strings, can also be found in an unpublished note which Dijkstra wrote on seeing Turner's first translation (EWD735, 1980), but which no-one, other than one anonymous referee, seems to have heard of. There's also the related BC-chain translation, which is a linear-space representation of Turner's second translation (Noshita & Hikita, 1984, RIMS Symp. on Sofware Science & Engineering, Kyoto, 1984 - I'm sure there's a more accessible reference, but I cant find it). Evaluation of BC-chain expressions takes linear time relative to evaluation of Turner (extended set) combinators. I dont know what the constant factors look like. Of course, given an nlogn translation into some fixed finite set of combinators, you immediately have an nlogn translation into any finite set capable of expressing a translation, even pure SK (compose the first translation with a translation of the first set into the second - only a linear expansion), though of course that's another purely theoretical result - a bit like producing a RISC compiler by combining a CISC compiler with a RISC emulator... Supercombinators (Hughes, thesis, Oxford, 1984, and umpteen papers by various people since then) are nlogn in the average and worst case. Their advantage is not in the size of the translation but in speed of execution, due to the fact that each supercombinator embodies a larger chunk of computation, and that the supercombinators you get are tailored to the program being compiled, providing more opportunities for clever implementation. (Or so I assume - has anyone compared the actual performance of something like SKIM2 running director strings with a supercombinator machine?) > Now, code size isn't usually much of a problem, in that we don't > usually care whether a program is 50K or 250K. :-) Unless you're running from floppies...(sorry, was thinking of comp.sys.mac) > The difference between 50K > and 2^50K is significant. I don't know if the translation has a decent > expected case or not. It's difficult to define the "expected case" - depends on the probability distribution of the programs you compile. What is a "typical" program? For the nlogn translations it doesnt matter - once your worst case is down to nlogn, the counting argument shows that must be the average case as well, but for Turner's quadratic translation, you have to look at what sort of programs produce the worst case. Worst-case lambda-expressions for this translation have deeply nested subexpressions and functions of many arguments - the nesting has to be proportional to the size of the program to get the quadratic effects, e.g. (backslash = lambda): \x1.\x2.\x3.\x4.\x5.(x5 ((x2 (x1 x4)) x3)) Does this ever happen? > In any event, the SK compiler has a lot of work to do before it can match > even a fairly stupid supercombinator compiler, simply because it can be > forced to produce incredibly long code. Is this a particular SK compiler you're talking about? Has someone actually implemented the naive translation into just S and K? I'm not surprised it's no good. > -- Bard Bloom -- Richard Kennaway School of Information Systems, University of East Anglia, Norwich, U.K. uucp: ...mcvax!ukc!uea-sys!jrk Janet: kennaway@uk.ac.uea.sys
jrk@s1.sys.uea.ac.uk (Richard Kennaway CMP RA) (11/19/88)
Correction: Kennaway, Inf.Proc.Letters, vol.24, 1987, should have been Kennaway&Sleep, Inf.Proc.Letters, vol.24, 1987. -- Richard Kennaway School of Information Systems, University of East Anglia, Norwich, U.K. uucp: ...mcvax!ukc!uea-sys!jrk Janet: kennaway@uk.ac.uea.sys