markf@ZURICH.AI.MIT.EDU (09/23/89)
Sometime in the middle past (i.e. more than six months ago and less than two years ago) someone posted a nice derivation of the Y combinator. He or she took a recursive definition of factorial (I think) and through a series of abstractions derived Y. I thought I saved this little gem but I can't find it. I also can't find it in the scheme archives here at MIT. If someone else saved this or if the author reads this I would appreciate it if they could send me a copy. Thanks. Mark Friedman MIT Artificial Intelligence Lab 545 Technology Sq. Cambridge, Ma. 02139 markf@zurich.ai.mit.edu
chewy@apple.com (Paul Snively) (09/23/89)
In article <8909222102.AA19330@zurich.ai.mit.edu> markf@ZURICH.AI.MIT.EDU writes: > Sometime in the middle past (i.e. more than six months ago and less > than two years ago) someone posted a nice derivation of the Y > combinator. He or she took a recursive definition of factorial (I > think) and through a series of abstractions derived Y. This sounds suspiciously like the derivation of the applicative-order Y-combinator that appears in "The Little Lisper," either the trade edition or the third edition of which is STILL the standard way to turn someone on to Scheme, in my book, especially if they're not much into computer science (if they're already into computer science, I'll forget the quick kiss of "The Little Lisper" and move right on into pure seduction with "Structure and Interpretation of Computer Programs.") __________________________________________________________________________ Just because I work for Apple Computer, Inc. doesn't mean that they believe what I believe or vice-versa. __________________________________________________________________________
jozwiak@uicsrd.csrd.uiuc.edu (09/24/89)
You may want to consult: "Denotational Semantics-A Methodology for Language Development" by David A. Schmidt, ISBN 0-697-06849-8. I think that the Y combinator is: Y = lambda h.(lambda x. h(x x)) (lambda x. h(x x)) . I think that this does what it should is pretty patent from it, but as for deriving it, this involves insight that I don't yet have. john
pierce@lanai.cs.ucla.edu (Brad Pierce) (09/25/89)
In article <16800018@uicsrd.csrd.uiuc.edu> jozwiak@uicsrd.csrd.uiuc.edu writes: > >You may want to consult: > "Denotational Semantics-A Methodology for Language Development" >by David A. Schmidt, ISBN 0-697-06849-8. > I think that the Y combinator is: > > Y = lambda h.(lambda x. h(x x)) > (lambda x. h(x x)) . > > I think that this does what it should is pretty patent from it, but >as for deriving it, this involves insight that I don't yet have. > > john I think the previous poster was asking for a derivation of the *applicative-order* Y combinator, not the normal-order as given here. ``The Little Lisper'' (as someone else mentioned) has a good presentation, but its complexity points out the price pays for using an applicative-order language. In normal-order languages, like standard lambda calculus, the derivation is straightforward. The motivation for the Y-combinator in lambda calculus is to allow the use of recursive definitions. Recall that there is no such thing as assignment in lambda calculus, but that this elegant and mathematically tractable model of computation is rich enough to support recursion anyway. The standard example goes as follows: Define the factorial function FACT(n) = n! as FACT (n) = lambda n. if (n == 0) 1 else n*FACT(n-1) Then abstracting out the FACT, define: H = lambda fact. lambda n. if (n == 0) 1 else n*fact(n-1) Then we have the equation: FACT = H(FACT) We want to solve for FACT. Since the only other factor is the function H, we look for a function that will take H as input and return the solution FACT. Said another way, we're looking for a function, F, that satisfies the following equation: FH = H(FH) If combinator theory is defined in terms of the lambda calculus, then we can define combinators as lambda expressions without free variables. For obvious reasons, a combinator with the kind of property described above is called a fixed-point combinator. It is possible to define an infinite number of different fixed- point combinators in the lambda calculus, but the simplest is the so-called Y-combinator given above. It is easy to verify the definition, as long as you remember that it is to be evaluated in normal-order. Finding a fixed-point combinator is not so easy when using applicative-order evaluation. Please see the ``Little Lisper''. Maybe it's time to start thinking seriously about whether the current performance benefits of applicative-order evaluation is worth the theoretical and human factors price tag. -- Brad
marshall@iuvax.cs.indiana.edu (Jim Marshall) (09/25/89)
Hello world. Someone recently requested information about deriving the applicative-order Y-combinator from scratch, so I thought I would post a derivation of it in Scheme. The following derivation is similar in flavor to the derivation found in "The Little LISPer" by Friedman/Felleisen, but uses a slightly different starting approach (for one thing, I begin with the "length" function). Maybe this version will be a little easier to follow (maybe not, who knows?). Enjoy... Step 0... We wish to write the recursive function "length" without having to give it a name. Then we could invoke it on a list as follows: ((lambda (l) (if (null? l) 0 (add1 (??? (cdr l))))) '(any old list you like)) The problem, of course, is that we can't plug in the function expression itself directly in place of the ??? because that immediately leads to an infinite regress. It's like trying to quote an entire sentence inside the sentence itself. ---------------------------------------------------------------------- Step 1... This is the _only_ step in the entire derivation that requires any real thought. We can get around the above problem by passing in a _copy_ of the length function as an extra argument and then using _that_ in place of the ???. But we must ensure that the copy of our function looks _exactly_ like the function itself at all times. WE WILL ADHERE TO THIS REQUIREMENT IN EVERY STEP THAT FOLLOWS. Notice that since f is a copy of the function, and the function takes a copy of itself as a second argument, we must also pass f to f (as a second argument). Passing a copy of the function to itself is the whole secret of the Y-combinator. The following expression will evaluate to 5 (the length of the example list): ((lambda (l f) (if (null? l) 0 (add1 (f (cdr l) f)))) '(any old list you like) (lambda (l f) (if (null? l) 0 (add1 (f (cdr l) f))))) ------------------------------------------------------------------------ Step 2... We just switch the order of the function arguments, l and f. "(f (cdr l) f)" changes to "(f f (cdr l))", and the arguments to the top-level invocation also switch places: ((lambda (f l) (if (null? l) 0 (add1 (f f (cdr l))))) (lambda (f l) (if (null? l) 0 (add1 (f f (cdr l))))) '(any old list you like)) ------------------------------------------------------------------------ Step 3... We simply curry the function so that it takes its two arguments one at a time. Note the extra set of ()'s around the top-level invocation as well. This expression still evaluates to 5: (((lambda (f) (lambda (l) (if (null? l) 0 (add1 ((f f) (cdr l)))))) (lambda (f) (lambda (l) (if (null? l) 0 (add1 ((f f) (cdr l))))))) '(any old list you like)) ------------------------------------------------------------------------- Step 4... The above expression is now of the form "(<function> '(any old list you like))", where <function> is now a self-contained recursive version of "length", although still in a clumsy form. We can forget about '(any old list you like) for the remainder of the derivation and concentrate on just the <function> part, since that's what we're interested in. So here it is by itself: ((lambda (f) (lambda (l) (if (null? l) 0 (add1 ((f f) (cdr l)))))) (lambda (f) (lambda (l) (if (null? l) 0 (add1 ((f f) (cdr l))))))) ------------------------------------------------------------------------- Step 5... Notice that in the above expression (f f) returns a function which gets applied to (cdr l). In the same way that the add1 function is equivalent to the function (lambda (a) (add1 a)), the "(f f) function" is equivalent to the function (lambda (a) ((f f) a)). [This is just an inverse eta step to you lambda-calculus pros out there]. This step is necessary to avoid infinite loops, since we're assuming applicative order (i.e, Scheme). ((lambda (f) (lambda (l) (if (null? l) 0 (add1 ((lambda (a) ((f f) a)) (cdr l)))))) (lambda (f) (lambda (l) (if (null? l) 0 (add1 ((lambda (a) ((f f) a)) (cdr l))))))) ------------------------------------------------------------------------- Step 6... Here we just give the (lambda (a) ((f f) a)) function the name "r" using a let-expression. Simple. (Notice how every change to our function requires an identical change to the _copy_ of the function, as mentioned earlier). ((lambda (f) (let ([r (lambda (a) ((f f) a))]) (lambda (l) (if (null? l) 0 (add1 (r (cdr l))))))) (lambda (f) (let ([r (lambda (a) ((f f) a))]) (lambda (l) (if (null? l) 0 (add1 (r (cdr l)))))))) ------------------------------------------------------------------------- Step 7... Now we just expand the let-expressions into their equivalent lambda-forms. In general, "(let ([x val]) body)" is equivalent to "((lambda (x) body) val)". This may look complicated but it's not. Just match up the ()'s carefully. ((lambda (f) ((lambda (r) (lambda (l) (if (null? l) 0 (add1 (r (cdr l)))))) (lambda (a) ((f f) a)))) (lambda (f) ((lambda (r) (lambda (l) (if (null? l) 0 (add1 (r (cdr l)))))) (lambda (a) ((f f) a))))) ------------------------------------------------------------------------- Step 8... Now we can give the (lambda (r) (lambda (l) ...)) expression a name also ("m") using a let-expression, since it has no free variables (except for primitives, but they're bound globally anyway). This step is just like Step 6. (let ([m (lambda (r) (lambda (l) (if (null? l) 0 (add1 (r (cdr l))))))]) ((lambda (f) (m (lambda (a) ((f f) a)))) (lambda (f) (m (lambda (a) ((f f) a)))))) ------------------------------------------------------------------------- Step 9... Now we replace the let-expression for "m" by its equivalent lambda-form, just like in Step 7, and out pops the applicative-order Y-combinator! The expression below still represents the self-contained recursive length function, but now it's in a nicer form. In particular, the (lambda (m) ...) sub-expression is Y: ((lambda (m) ((lambda (f) (m (lambda (a) ((f f) a)))) (lambda (f) (m (lambda (a) ((f f) a)))))) (lambda (r) (lambda (l) (if (null? l) 0 (add1 (r (cdr l))))))) ------------------------------------------------------------------------ Step 10... We just pull out the (lambda (m) ...) sub-expression and call it Y, since all of its variables are bound (after all, it's a combinator). Then the expression above for the recursive length function can be rewritten as shown below. The expression passed to Y is a "template" for the recursive length function. Instead of "???", we call the recursive invocation "r", wrap the whole thing with (lambda (r) ...), and hand it over to Y, which returns a self-contained recursive function. You can give it a name with define if you want, but you don't have to. (define Y (lambda (m) ((lambda (f) (m (lambda (a) ((f f) a)))) (lambda (f) (m (lambda (a) ((f f) a))))))) (Y (lambda (r) (lambda (l) (if (null? l) 0 (add1 (r (cdr l))))))) ------------------------------------------------------------------------- Jim Marshall, Gradual Student | "That it will never come again Computer Science Department | Is what makes life so sweet." Indiana University, Bloomington | -Emily Dickinson marshall@iuvax.cs.indiana.edu | -------------------------------------------------------------------------
dorai@titan.rice.edu (Dorai Sitaram) (09/29/89)
In article <27390@shemp.CS.UCLA.EDU> pierce@lanai.UUCP (Brad Pierce) writes:
$
$I think the previous poster was asking for a derivation of the
$*applicative-order* Y combinator, not the normal-order as given
$here. ``The Little Lisper'' (as someone else mentioned) has a
$good presentation, but its complexity points out the price pays
$for using an applicative-order language.
$
$In normal-order languages, like standard lambda calculus, the
$derivation is straightforward. The motivation for the Y-combinator
$in lambda calculus is to allow the use of recursive definitions.
$Recall that there is no such thing as assignment in lambda calculus,
$but that this elegant and mathematically tractable model of
$computation is rich enough to support recursion anyway.
$
$The standard example goes as follows:
$
$ [A quite proper description of normal-order Y deleted.]
$
$is the so-called Y-combinator given above. It is easy to verify
$the definition, as long as you remember that it is to be
$evaluated in normal-order. Finding a fixed-point combinator
$is not so easy when using applicative-order evaluation. Please see
$the ``Little Lisper''.
$
$Maybe it's time to start thinking seriously about whether the
$current performance benefits of applicative-order evaluation
$is worth the theoretical and human factors price tag.
I find your _conclusions_ (ie, not the body of your posting)
astounding. The applicative-order Y is not all that prohibitively more
complex than the normal-order one. One only has to replace in
$> Y = lambda h.(lambda x. h(x x))
$> (lambda x. h(x x)) .
both occurrences of (x x) by (lambda z. (x x) z). Note also that the
new Y shares the elegance and lack of assignment that you claim for
the old Y.
Re the current acceptance of applicative-order evaluation in the
world's programming languages, I hardly think it can be attributed
solely to performance benefits. Applicative-order is _understandable_
in the presence of side-effects in a way that normal-order fails to
be. You may protest that assignment-type side-effects should be
eliminated from the language, but then things like input and output
cannot be totally wished away.
Assuming you have a 'let' defined in the usual way in terms of lambda,
we have, in a potential normal-order real-life programming language
with input/output, among other things,
(let ([y (printf "y~n")])
((lambda (x) t) y))
never printing y at all;
(let ([y (printf "y~n")])
(or y y))
printing y twice (assuming printf returns nil);
whereas in applicative order, y is just printed once (as expected,
IMHO), regardless of the body of the let.
--dorai
-------------------------------------------------------------------------------
Alles ist richtig, auch das Gegenteil. Nur: zwar ... aber -- das ist
nie richtig. --Kurt Tucholsky
-------------------------------------------------------------------------------