[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Y combinator derivation (long)



Hello world.  Someone recently requested information about deriving the
applicative-order Y-combinator from scratch, so I thought I would post a
derivation of it in Scheme.  The following derivation is similar in flavor
to the derivation found in "The Little LISPer" by Friedman/Felleisen, but
uses a slightly different starting approach (for one thing, I begin with
the "length" function).  Maybe this version will be a little easier to
follow (maybe not, who knows?).  Enjoy...

Step 0...  We wish to write the recursive function "length" without
having to give it a name.  Then we could invoke it on a list as follows:

((lambda (l)
   (if (null? l)
       0
       (add1 (??? (cdr l)))))
 '(any old list you like))

The problem, of course, is that we can't plug in the function
expression itself directly in place of the ??? because that
immediately leads to an infinite regress.  It's like trying to quote
an entire sentence inside the sentence itself.

----------------------------------------------------------------------

Step 1...  This is the _only_ step in the entire derivation that
requires any real thought.  We can get around the above problem by
passing in a _copy_ of the length function as an extra argument and
then using _that_ in place of the ???.  But we must ensure that the
copy of our function looks _exactly_ like the function itself at all
times.  WE WILL ADHERE TO THIS REQUIREMENT IN EVERY STEP THAT FOLLOWS.
Notice that since f is a copy of the function, and the function takes
a copy of itself as a second argument, we must also pass f to f (as a
second argument).  Passing a copy of the function to itself is the
whole secret of the Y-combinator.  The following expression will
evaluate to 5 (the length of the example list):

((lambda (l f)
   (if (null? l)
       0
       (add1 (f (cdr l) f))))
 '(any old list you like)
 (lambda (l f)
   (if (null? l)
       0
       (add1 (f (cdr l) f)))))

------------------------------------------------------------------------

Step 2...  We just switch the order of the function arguments, l and
f.  "(f (cdr l) f)" changes to "(f f (cdr l))", and the arguments to
the top-level invocation also switch places:

((lambda (f l)
   (if (null? l)
       0
       (add1 (f f (cdr l)))))
 (lambda (f l)
   (if (null? l)
       0
       (add1 (f f (cdr l)))))
 '(any old list you like))

------------------------------------------------------------------------

Step 3...  We simply curry the function so that it takes its two
arguments one at a time.  Note the extra set of ()'s around the
top-level invocation as well.  This expression still evaluates to 5:

(((lambda (f)
    (lambda (l)
      (if (null? l)
	  0
	  (add1 ((f f) (cdr l))))))
  (lambda (f)
    (lambda (l)
      (if (null? l)
	  0
	  (add1 ((f f) (cdr l)))))))
 '(any old list you like))

-------------------------------------------------------------------------

Step 4...  The above expression is now of the form
"(<function> '(any old list you like))", where <function> is now a
self-contained recursive version of "length", although still in a
clumsy form.  We can forget about '(any old list you like) for the
remainder of the derivation and concentrate on just the <function>
part, since that's what we're interested in.  So here it is by itself:

((lambda (f)
   (lambda (l)
     (if (null? l)
	 0
	 (add1 ((f f) (cdr l))))))
 (lambda (f)
   (lambda (l)
     (if (null? l)
	 0
	 (add1 ((f f) (cdr l)))))))

-------------------------------------------------------------------------

Step 5...  Notice that in the above expression (f f) returns a
function which gets applied to (cdr l).  In the same way that the add1
function is equivalent to the function (lambda (a) (add1 a)), the
"(f f) function" is equivalent to the function (lambda (a) ((f f) a)).
[This is just an inverse eta step to you lambda-calculus pros out
there].  This step is necessary to avoid infinite loops, since we're
assuming applicative order (i.e, Scheme).

((lambda (f)
   (lambda (l)
     (if (null? l)
	 0
	 (add1 ((lambda (a) ((f f) a)) (cdr l))))))
 (lambda (f)
   (lambda (l)
     (if (null? l)
	 0
	 (add1 ((lambda (a) ((f f) a)) (cdr l)))))))

-------------------------------------------------------------------------

Step 6...  Here we just give the (lambda (a) ((f f) a)) function the name
"r" using a let-expression.  Simple.  (Notice how every change to our
function requires an identical change to the _copy_ of the function, as
mentioned earlier).

((lambda (f)
   (let ([r (lambda (a) ((f f) a))])
     (lambda (l)
       (if (null? l)
	   0
	   (add1 (r (cdr l)))))))
 (lambda (f)
   (let ([r (lambda (a) ((f f) a))])
     (lambda (l)
       (if (null? l)
	   0
	   (add1 (r (cdr l))))))))

-------------------------------------------------------------------------

Step 7...  Now we just expand the let-expressions into their equivalent
lambda-forms.  In general, "(let ([x val]) body)" is equivalent to
"((lambda (x) body) val)".  This may look complicated but it's not.
Just match up the ()'s carefully.

((lambda (f)
   ((lambda (r)
      (lambda (l)
	(if (null? l)
	    0
	    (add1 (r (cdr l))))))
    (lambda (a) ((f f) a))))
 (lambda (f)
   ((lambda (r)
      (lambda (l)
	(if (null? l)
	    0
	    (add1 (r (cdr l))))))
    (lambda (a) ((f f) a)))))

-------------------------------------------------------------------------

Step 8...  Now we can give the (lambda (r) (lambda (l) ...))
expression a name also ("m") using a let-expression, since it has
no free variables (except for primitives, but they're bound globally
anyway).  This step is just like Step 6.

(let ([m (lambda (r)
	   (lambda (l)
	     (if (null? l)
		 0
		 (add1 (r (cdr l))))))])
  ((lambda (f)
     (m (lambda (a) ((f f) a))))
   (lambda (f)
     (m (lambda (a) ((f f) a))))))

-------------------------------------------------------------------------

Step 9...  Now we replace the let-expression for "m" by its equivalent
lambda-form, just like in Step 7, and out pops the applicative-order
Y-combinator!  The expression below still represents the self-contained
recursive length function, but now it's in a nicer form.  In particular,
the (lambda (m) ...) sub-expression is Y:

((lambda (m)
   ((lambda (f) (m (lambda (a) ((f f) a))))
    (lambda (f) (m (lambda (a) ((f f) a))))))
 (lambda (r)
   (lambda (l)
     (if (null? l)
	 0
	 (add1 (r (cdr l)))))))

------------------------------------------------------------------------

Step 10...  We just pull out the (lambda (m) ...) sub-expression and
call it Y, since all of its variables are bound (after all, it's a
combinator).  Then the expression above for the recursive length function
can be rewritten as shown below.  The expression passed to Y is a
"template" for the recursive length function.  Instead of "???", we call
the recursive invocation "r", wrap the whole thing with (lambda (r) ...),
and hand it over to Y, which returns a self-contained recursive function.
You can give it a name with define if you want, but you don't have to.

(define Y
  (lambda (m)
    ((lambda (f) (m (lambda (a) ((f f) a))))
     (lambda (f) (m (lambda (a) ((f f) a)))))))

(Y (lambda (r)
     (lambda (l)
       (if (null? l)
	   0
	   (add1 (r (cdr l)))))))

-------------------------------------------------------------------------
Jim Marshall, Gradual Student      |  "That it will never come again
Computer Science Department        |   Is what makes life so sweet."
Indiana University, Bloomington    |                 -Emily Dickinson
marshall@iuvax.cs.indiana.edu      |
-------------------------------------------------------------------------