[comp.lang.scheme] Y, again

jinx@CHAMARTIN.AI.MIT.EDU (Guillermo J. Rozas) (12/15/88)

The recent flurry of messages about the Y operator has prompted me to
spend yet a little more time on the MIT Scheme compiler.  The most
recent version (4.33) does a good job on some versions of the Y
operator, but not on all.  This message should clarify what it can
currently do and what it cannot.

I should point out that there is no explicit "Y-recognizer" in the
compiler; rather, the results fall out of the interaction of several
of its parts:
 - value analysis (which values can reach what variables).  In
particular, a variable's value is "known" if only one value can reach
it.
 - closure analysis (which lambda expressions need to be represented
by actual objects at run time, instead of having an implicit
representation in the code stream)
 - a limited side effect analysis which allows the compiler to "punt"
some calls when the callee has no side effects and the value returned
is unused or known and available at the call point.  This is the piece
of code that I've written recently to make the examples below "work".

Some of my thoughts about the points raised in recent messages:

As far as using some version of Y to implement LETREC, we could
probably start doing it and users would hardly ever notice the
difference, but aside from conceptual elegance nothing would be
gained, so it's not really worth it in a "real" compiler.

Why have I spent any energy getting the Y operator to "work"?  From a
compiler writer's point of view, I'm not interested in the Y operator
per se, except as a "cute" hack.  It just happens to be a very good
test case for value and closure analysis, and if the compiler can do a
good job on code that uses it, it will hopefully also do a good job on
the "simpler" cases that occur in practice.

About eq?/eqv?-ness of procedures, I should say that the only time we
may need to distinguish multiple procedures whose "code" is the same
lambda expression is when they can coexist in time, and free variables
can cause the different instances to behave differently (return
different values and/or perform different effects).  Clearly, any
correct compiler cannot "merge" two such instances.  The rest of the
cases are for philosophers to argue about.  To make them happy,
compiler writers may have to provide a "pedantic" switch in the
compiler to prevent merging of otherwise identical procedures.  I'm
not saying that determining when two procedures "act the same way" is
feasible (we all know about the halting problem), but if we are able
to determine in some instance that the only thing that could
discriminate between two procedures is eq?/eqv?, there is really no
reason not to merge.
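
A small illustration of the distinction (the example is mine, not from
the compiler):

```scheme
;; Two instances of the same lambda expression that no correct compiler
;; may merge: they coexist in time and differ in the value of the free
;; variable n, so they behave observably differently.
(define (make-adder n)
  (lambda (x) (+ x n)))

(define add1 (make-adder 1))
(define add2 (make-adder 2))
;; (add1 10) => 11, but (add2 10) => 12.

;; Two instances that only eq?/eqv? could ever tell apart: the lambda
;; expression is closed (no free variables), so merging the instances
;; changes nothing observable except procedure identity.
(define (make-identity)
  (lambda (x) x))
```

Only the "pedantic" switch mentioned above would have any reason to
keep the results of different (make-identity) calls distinct.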

Here are some examples of what the MIT Scheme compiler can currently do:

The following versions of factorial turn into identical (bitwise) code:

;; version 1, using letrec.  reference version

(define fact
  (letrec ((fact (lambda (n)
		   (if (= n 0)
		       1
		       (* n (fact (- n 1)))))))
    fact))

;; version 2, "subroutinized" "standard" applicative order

(define fact
  ((lambda (f)
     (let ((g (lambda (g) (f (lambda () (g g))))))
       (g g)))
   (lambda (fg)
     (lambda (n)
       (if (= n 0)
	   1
	   (* n ((fg) (- n 1))))))))

;; version 3, "standard" normal order

(define fact
  ((lambda (f)
     ((lambda (g) (f (g g)))
      (lambda (g) (f (g g)))))
   (lambda (fact)
     (lambda (n)
       (if (= n 0)
	   1
	   (* n (fact (- n 1))))))))

Note that version 3 "has no right to work" in Scheme: in applicative
order, evaluating the argument (g g) in (f (g g)) just reproduces the
original application, so the program should loop forever before f is
ever called.  Still, I've never found anything wrong with compilers
that "fix" users' code.

The following two versions turn into code essentially identical to the
code generated for the versions above, with the differences described
below.

;; version 4, another "subroutinized" "standard" applicative order

(define fact
  ((lambda (f)
     (let ((g (lambda (g) (f (lambda (x) ((g g) x))))))
       (g g)))
   (lambda (fact)
     (lambda (n)
       (if (= n 0)
	   1
	   (* n (fact (- n 1))))))))

;; version 5, "inline coded" Y

(define fact
  (let ((kernel
	 (lambda (p n)
	   (if (= n 0)
	       1
	       (* n (p p (- n 1)))))))
    (lambda (n)
      (kernel kernel n))))

The code generated for version 4 differs from the code in versions 1-3
only in the order in which the basic blocks appear in the linearized
output.

The code for version 5 differs because the compiler is currently not
smart enough to eta convert (lambda (n) ...) into (lambda (p n) ...)
after it determines that p in (lambda (p n) ...) is used only for
"self reference" and can be dropped.  Thus there is an initial call
(branch instruction) from the code generated for (lambda (n) ...) to
the code generated for (lambda (p n) ...), but the actual "recursive"
procedure is coded identically to the other cases above.

As an example of a hairier case where the compiler "wins", it
generates identical code for

(define (delq! item items)
  (letrec ((trim-initial-segment
	    (lambda (items)
	      (if (pair? items)
		  (if (eq? item (car items))
		      (trim-initial-segment (cdr items))
		      (begin (locate-initial-segment items (cdr items))
			     items))
		  items)))
	   (locate-initial-segment
	    (lambda (last this)
	      (if (pair? this)
		  (if (eq? item (car this))
		      (set-cdr! last (trim-initial-segment (cdr this)))
		      (locate-initial-segment this (cdr this)))
		  this))))
    (trim-initial-segment items)))

and

(define delq!
  (lambda (item items)
    (((lambda (b f g)
	((lambda (k1 k2)
	   (b (lambda () (k1 k1 k2))
	      (lambda () (k2 k1 k2))))
	 (lambda (k1 k2)
	   (f (lambda () (k1 k1 k2))
	      (lambda () (k2 k1 k2))))
	 (lambda (k1 k2)
	   (g (lambda () (k1 k1 k2))
	      (lambda () (k2 k1 k2))))))
      (lambda (p1g p2g)
	(lambda ()
	  ((p1g) items)))
      (lambda (p11g p12g)
	(lambda (items)
	  (if (pair? items)
	      (if (eq? item (car items))
		  ((p11g) (cdr items))
		  (begin ((p12g) items (cdr items))
			 items))
	      items)))
      (lambda (p21g p22g)
	(lambda (last this)
	  (if (pair? this)
	      (if (eq? item (car this))
		  (set-cdr! last ((p21g) (cdr this)))
		  ((p22g) this (cdr this)))
	      this)))))))

Examples where the compiler "loses":

;; version 6 "non-subroutinized" "standard" applicative order

(define fact
  ((lambda (f)
     ((lambda (g) (f (lambda () (g g))))
      (lambda (g) (f (lambda () (g g))))))
   (lambda (fg)
     (lambda (n)
       (if (= n 0)
	   1
	   (* n ((fg) (- n 1))))))))

;; version 7 another "non-subroutinized" "standard" applicative order

(define fact
  ((lambda (f)
     ((lambda (g) (f (lambda (x) ((g g) x))))
      (lambda (g) (f (lambda (x) ((g g) x))))))
   (lambda (fact)
     (lambda (n)
       (if (= n 0)
	   1
	   (* n (fact (- n 1))))))))

Version 6 does not win because the compiler is not smart enough to
realize that both copies of (lambda () (g g)) are indistinguishable.
This causes (lambda (n) ...) to be represented as a closure (with one
free variable, namely fg), and even though the compiler realizes that
the call (fg) always returns a version of (lambda (n) ...) and
therefore codes the call as a direct branch, there is still overhead
associated with the closure and computing its "next" version (which
will be unused).

Version 7 does not win for a similar reason.  Again, the compiler does
not realize that both copies of (lambda (x) ...) are identical and
therefore causes (lambda (n) ...) to be closed over fact.  Both calls
((g g) x) are known to be calls to (lambda (n) ...), and are
translated as direct branches after computing the "next" (useless)
closure for (lambda (n) ...).

There are many other cases where it will lose due to the
"non-uniqueness" of the values which may reach individual variables.

An altogether different reason why it may lose is the fact that once a
closure is detected, a "pessimistic" code generation strategy is used
which forces some of the "useless" procedures and overhead to
reappear.

max@polya.Stanford.EDU (Max Hailperin) (12/16/88)

I think I may have just been tarred as a philosopher or as pedantic --
ouch!  So let me try to clarify what I see to be some of the more
substantive issues:

1) I didn't mean to imply that Rozas's compiler actually has a Y
   recognizer; my point was merely that it wasn't surprising that it
   could do those things a Y recognizer could.  The question is, how
   does it hold up in more difficult cases?  One case I'd very much
   like Rozas to compare the generated code for is Y itself, *not
   applied to anything*.  E.g., how similar/different is the code for
   these two procedures:
     (define (Y1 f)
       (let ((g (lambda (g)
                  (f (lambda (x) ((g g) x))))))
         (g g)))

     (define (Y2 f)
       (letrec ((y-of-f (f (lambda (x) (y-of-f x)))))
         y-of-f))

2) I wasn't weighing the possibility of a compiler defining letrec in
   terms of Y, I was weighing the possibility of the language standard
   defining letrec in terms of Y.  While users of Rozas's compiler
   would notice no difference if he were to do so, users of other less
   optimizing compilers would notice a difference, namely in
   "self"-eqvness.  The issue isn't excess eqvness, either, as Rozas
   seems to think, but rather insufficient eqvness.  As long as Scheme
   supports procedural objects with identities, letrec will have to
   remain defined in terms of set! rather than Y.
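
A sketch of the "self"-eqvness point (my example; the exact behavior
of the Y-expanded version depends on how much merging an
implementation does):

```scheme
;; With the standard set!-based letrec, the recursive reference denotes
;; the procedure itself:
(letrec ((f (lambda () f)))
  (eqv? f (f)))                 ; required to be #t

;; With a Y-style expansion (eta-wrappers for the recursive reference,
;; as in version 4), the analogous test observes that the "recursive"
;; value is a freshly built wrapper, not the procedure itself:
(define f-via-y
  ((lambda (g) (g g))
   (lambda (g)
     (lambda () (lambda () ((g g)))))))
;; (eqv? f-via-y (f-via-y)) => #f here: each call builds a new wrapper,
;; although the wrapper behaves the same way when called.
```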

matthias@leto.rice.edu (Matthias Felleisen) (12/17/88)

In article <5684@polya.Stanford.EDU> mxh@sumex-aim.Stanford.EDU (Max Hailperin) writes:
>2) I wasn't weighing the possibility of a compiler defining letrec in
>   terms of Y, I was weighing the possibility of the language standard
>   defining letrec in terms of Y.  While users of Rozas's compiler
>   would notice no difference if he were to do so, users of other less
>   optimizing compilers would notice a difference, namely in
>   "self"-eqvness.  The issue isn't excess eqvness, either, as Rozas
>   seems to think, but rather insufficient eqvness.  As long as Scheme
>   supports procedural objects with identities, letrec will have to
>   remain defined in terms of set! rather than Y.

It is not the R^nRS-specification of eqv? [pp. 13-14] that requires
	(letrec ([f (lambda (x) (eqv? f f))]) (f 0))
to yield #t but the expansion-specification of letrec [p. 35].  If we
changed the letrec-specification, the compiler could return an
arbitrary value [recall: eqv? is an approximation of operational
equivalence].
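
For reference, the expansion in question is essentially the following
(ignoring the fresh temporaries the report introduces):

```scheme
;; (letrec ((f init)) body)  expands, per the report, into roughly:
;;
;;   (let ((f '<undefined>))
;;     (set! f init)
;;     body)
;;
;; Both occurrences of f below therefore read the same assigned
;; location, so the eqv? test must succeed:
(letrec ((f (lambda (x) (eqv? f f))))
  (f 0))                        ; => #t
```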

But efficiency is not really the important issue. The assignability of
function variables, in particular for recursive functions, requires
the current letrec-specification. The importance of this cannot be
overemphasized. Dan Friedman and I have shown how to use this trick
for define-by-need and import-by-need in a module environment, and Dan
and John Franco have recently written up a tech report on stream
techniques that use this trick.

-- Matthias

gateley@m2.csc.ti.com (John Gateley) (12/17/88)

In article <8812151447.AA06672@chamartin.ai.mit.edu> jinx@zurich.ai.mit.edu writes:
<[Lots of stuff about Y]
<;; version 3, "standard" normal order
<
<(define fact
<  ((lambda (f)
<     ((lambda (g) (f (g g)))
<      (lambda (g) (f (g g)))))
<   (lambda (fact)
<     (lambda (n)
<       (if (= n 0)
<	   1
<	   (* n (fact (- n 1))))))))
<
<Note that version 3 "has no right to work" in Scheme, but I've never
<found anything wrong with compilers that "fix" users' code.

If I understand correctly, then you are saying that this terminates (and
produces a factorial function) with your compiler. This is quite 
interesting! Does the compiler eliminate infinite loops all the time?
Or is it just when the loop terminates in normal order, but not in
applicative order? Has anybody complained because the compiler took
their loop out?

What about the semantics: Does Scheme require infinite loops to be generated?

John
gateley@tilde.csc.ti.com