markv@uoregon.uoregon.edu (Mark VandeWettering) (11/16/88)
Newsgroups: comp.lang.scheme
Subject: The Paradoxical Combinator -- Y
Reply-To: markv@drizzle.UUCP (Mark VandeWettering)
Distribution: world
Organization: University of Oregon, Computer Science, Eugene OR

Alternatively entitled: "Y? Why Not?"  :-)

The discussion that has been going on regarding the Y combinator as
the basic operation for implementing recursive functions is
interesting.  The practical tests that people have made show that the
Y combinator is orders of magnitude slower for implementing recursion
than compiling recursion directly.  This is true for Scheme.  I hold
that for an interesting set of languages (lazy languages) this result
will not necessarily hold.

The problem with Y isn't its complexity; it is the fact that it is an
inherently lazy operation.  Any implementation in Scheme is clouded by
the fact that Scheme is an applicative order evaluator, while Y
prefers to be evaluated in normal order.

    (define Y
      (lambda (g)
        ((lambda (h) (g (lambda (x) ((h h) x))))
         (lambda (h) (g (lambda (x) ((h h) x)))))))

    (define fact
      (lambda (f)
        (lambda (n)
          (if (= n 1)
              1
              (* n (f (- n 1)))))))

Evaluating ((Y fact) 2) results in the following operations in Scheme:

The argument is (trivially) evaluated, and returns two.

(Y fact) must be evaluated.  What is it?  Y and fact each evaluate to
closures.  When applied, Y binds g to fact and executes its body.  The
body is an application of a closure to another closure.  The operator
binds h to the operand and executes its body, which evaluates
(g (lambda (x) ((h h) x))).  The operand is a closure, which gets
built and returned.  g evaluates to fact.  We substitute the closure
(lambda (x) ((h h) x)) for the function f in the definition of fact,
giving

    (lambda (n)
      (if (= n 1)
          1
          (* n ((lambda (x) ((h h) x)) (- n 1)))))

which we return as the value of (Y fact).  When we apply this to 2, we
get

    (* 2 ((lambda (x) ((h h) x)) 1))

We then have to evaluate ((lambda (x) ((h h) x)) 1), or ((h h) 1).
But remembering that h was (lambda (h) (g (lambda (x) ((h h) x)))),
we have

    (((lambda (h) (g (lambda (x) ((h h) x))))
      (lambda (h) (g (lambda (x) ((h h) x)))))
     1)

So we rebind h to the right stuff and evaluate the body, which is

    ((g (lambda (x) ((h h) x))) 1)

which by the definition of g (still == fact) is just 1.  Hence
(* 2 1) = 2.

########################################################################
Summary: If you didn't follow this, don't worry; performing this
evaluation is cumbersome at best.  As far as a compiler or interpreter
is concerned, the high cost of evaluating this function comes from two
different aspects:

1.  It is necessary to create "suspended" values.  These suspended
    values are represented as closures, which are in general heap
    allocated and expensive.

2.  For every level of recursion, new closures are created (h gets
    rebound above).  While this could probably be optimized out by a
    smart compiler, the representation of suspended evaluation by
    lambdas does seem inefficient.
########################################################################

You can try to figure out how all this works; it is complicated, but I
believe I understand it.  The point of the derivation above is that in
Scheme, to understand how the implementation of Y works, you have to
fall back on the evaluation mechanism of Scheme.  Suspended values must
be represented as closures, and it is the creation of these closures
that causes the Scheme implementation to be slow.
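To make the normal-order issue concrete, here is a small sketch in
plain Scheme (the name Y-loops is mine, for illustration only) of why
the inner (lambda (x) ...) wrapper in Y is essential under applicative
order, and where the per-level allocation comes from:

    ;; Without the wrapper, the operand (h h) is evaluated
    ;; immediately, so this version diverges before g is ever
    ;; applied:
    (define Y-loops
      (lambda (g)
        ((lambda (h) (g (h h)))           ; (h h) recurses forever
         (lambda (h) (g (h h))))))

    ;; The Y given above terminates, because the closure
    ;; (lambda (x) ((h h) x)) suspends the self-application until
    ;; the recursive call actually occurs:
    ((Y fact) 5)                          ; => 120

    ;; Each recursive call forces that suspension, re-evaluating
    ;; (h h) and allocating fresh closures -- the per-level heap
    ;; cost described in the summary.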
If one wishes to abandon Scheme (or at least applicative order
evaluation of Scheme), one can typically do much better.  My thesis
work is in graph reduction, and in trying to better understand the
implementation issues.  In graph reduction, all data items (evaluated
and unevaluated) have the same representation: graphs in the heap.  We
choose to evaluate using an outermost, leftmost strategy.  This allows
the natural definition

    (Y h) = (h (Y h))

to be used.  An application node of the form:

            @
           / \
          /   \
         Y     h

can be rewritten in the obvious way as:

            @
           / \
          /   \
         h     @
              / \
             /   \
            Y     h

costing one heap allocation per level of recursion, which is certainly
cheaper than the multiple allocations of Scheme closures above.  More
efficiently, we might choose to implement it using a "knot tying"
version:

            /\
           /  \
           @   |
          / \  /
         /   \/
        h

which also works quite well.  Y has been eliminated, and will cause no
more reductions.  The basic idea is that recursion in a functional
language is analogous to a cycle in the graph of a graph reduction
engine.  The Y combinator is therefore a specific "textual" indicator
of such a cycle.

The G-machine (excellently described in Peyton Jones' book "The
Implementation of Functional Programming Languages") also treats this
form of recursion as efficient.  Peyton Jones chooses letrec as a
primitive in the extended lambda calculus, and his methodology for
compiling these recursive definitions is basically to compile fixed
code which builds these cyclic structures directly, rather than having
them built by reduction at runtime.

I think (and my thesis work is evolving into this kind of argument)
that Y is overlooked for trivial reasons.  Partial evaluation and
smarter code generation could make an SK-based compiler generate code
which is equal in quality to that produced by supercombinator-based
compilation.

This is too long already, ciao for now.

Mark VandeWettering
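P.S.  The "knot tying" trick can even be mimicked within Scheme itself.
A minimal sketch (the name fix is mine, for illustration): letrec
installs the binding for f before evaluating its initializer, so the
closure built below captures a reference back to f -- a cycle in the
environment, the analogue of the cyclic graph above.

    (define fix
      (lambda (g)
        ;; (lambda (n) (f n)) closes over the letrec-bound f, which
        ;; is then assigned the result of (g ...) -- tying the knot.
        ;; Only two closures are ever allocated, however deep the
        ;; recursion goes, and no Y reductions remain.
        (letrec ((f (g (lambda (n) (f n)))))
          f)))

    ((fix fact) 5)                        ; => 120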